NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more 
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention also relates to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO:1-1350. The polypeptide sequences are designated SEQ
ID NO:1351-2700. The nucleic acids and polypeptides are provided in the Sequence Listing. In
the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is
thymine; and N is any of the four bases. In the amino acids provided in the Sequence Listing, *
corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO:1-1350 under stringent hybridization conditions;
nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid
sequences recited above; and nucleic acid sequences that encode a peptide comprising a specific
domain or truncation of the peptides encoded by SEQ ID NO:1-1350. Also included is a
polynucleotide comprising a nucleotide sequence having at least 90% identity to an identifying
sequence of SEQ ID NO:1-1350 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO:1-1350. The sequence information can be a
segment of any one of SEQ ID NO:1-1350 that uniquely identifies or represents the sequence
information of SEQ ID NO:1-1350.

A collection as used in this application can be a collection of only one polynucleotide. The 
collection of sequence information or identifying information of each sequence can be provided on 
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid 
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO:1-1350 or novel
segments or parts of the nucleic acids of the invention are used as primers in expression assays that
are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of
SEQ ID NO:1-1350 or novel segments or parts of the nucleic acids provided herein are used in
diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath
et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human
genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:1-1350; a
polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO:1-1350;
and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding
sequences of SEQ ID NO:1-1350. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a)
the complement of any one of the nucleotide sequences set forth in SEQ ID NO:1-1350; (b) a
nucleotide sequence encoding any one of the amino acid sequences set forth in the Sequence Listing
(e.g., SEQ ID NO:1351-2700); (c) a polynucleotide which is an allelic variant of any
polynucleotides recited above; (d) a polynucleotide which encodes a species homolog (e.g.,
orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of any of the polypeptides comprising an amino acid
sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding 
full length or mature protein. Polypeptides of the invention also include polypeptides with
biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence
set forth in SEQ ID NO:1-1350; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.
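
By way of illustration only, the percentage thresholds recited above can be checked with a short
calculation. The Python sketch below assumes that the two amino acid sequences have already been
aligned to equal length, with '-' marking a gap, and counts identical columns. The function names,
the gap convention, and the choice of denominator are assumptions made for this example; the
passage above does not specify how sequence identity is computed.

```python
# Hypothetical helper names; the identity calculation shown here is an assumption,
# not a method defined in the specification.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)  # percentages recited in the text


def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity):
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Two made-up ten-residue peptides differing at one position.
    identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(identity, thresholds_met(identity))
```

In practice an alignment program would produce the aligned strings; this sketch covers only the
final counting step.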

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a
polypeptide/compound complex, wherein the complex drives expression of a reporter gene
sequence in the cell; and detecting the complex by detecting the reporter gene sequence
expression such that if expression of the reporter gene is detected the compound that binds to a
polypeptide of the invention is identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2). If no homology is set forth
for a sequence, then the polypeptides and polynucleotides of the present invention are useful for
a variety of applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
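
For illustration only, the base-pairing rule in the preceding definition can be expressed as a short
Python sketch. It reproduces the 5'-AGT-3' / 3'-TCA-5' example and scores "partial" versus
"complete" complementarity as the fraction of paired positions. The function names and the
fractional score are assumptions introduced for this example and are not terms used elsewhere in
this document.

```python
# Watson-Crick pairing for DNA; N (any base) is mapped to N for convenience.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def complement(seq):
    """Base-by-base complement; if seq is written 5'->3', the result reads 3'->5'."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def complementarity(strand_5to3, partner_3to5):
    """Fraction of positions that base-pair when the partner is written 3'->5'.

    1.0 corresponds to "complete" complementarity; values between 0 and 1
    correspond to "partial" complementarity.
    """
    a, b = strand_5to3.upper(), partner_3to5.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(a, b) if x in "ACGT" and COMPLEMENT[x] == y)
    return paired / len(a)


if __name__ == "__main__":
    print(complement("AGT"))              # TCA, i.e. 3'-TCA-5'
    print(reverse_complement("AGT"))      # ACT, i.e. 5'-ACT-3'
    print(complementarity("AGT", "TCA"))  # 1.0 (complete)
```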

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or 
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide 
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides, 
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs:1-1350.

Probes may, for example, be used to determine whether specific mRNA molecules are 
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl. 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the 
art. Probes of the present invention, their preparation and/or labeling are elaborated in 
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John 
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NO:1-1350. The sequence information
can be a segment of any one of SEQ ID NO:1-1350 that uniquely identifies or represents the
sequence information of that sequence of SEQ ID NO:1-1350. One such segment can be a
twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in
the human genome is approximately 1 in 300. In the human genome, there are three billion base
pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
approximately 300 times more twenty-mers
than there are base pairs in a set of human chromosomes. Using the same analysis, the 
probability for a seventeen-mer to be fully matched in the human genome is approximately 1 in 
5. When these segments are used in arrays for expression studies, fifteen-mer segments can be 
used. The probability that the fifteen-mer is fully matched in the expressed sequences is also 
approximately one in five because expressed sequences comprise less than approximately 5% of 
the entire genome sequence. 
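
The arithmetic behind these estimates can be reproduced with a few lines of Python, offered here
only as an illustrative sketch. It divides the number of possible k-mers (4^k) by the number of
base pairs in the target, using the figures stated above: three billion base pairs per chromosome
set and an expressed fraction of 5%. The function and constant names are assumptions for this
example, and treating every position as an independent, uniformly random k-mer is a
simplification.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed fraction (from the text)
EXPRESSED_BP = GENOME_BP * EXPRESSED_FRACTION


def possible_kmers(k):
    """Number of distinct sequences of length k over the alphabet A, C, G, T."""
    return 4 ** k


def full_match_ratio(k, target_bp=GENOME_BP):
    """Possible k-mers per target base pair.

    A ratio of r means a given k-mer is expected to match the target by
    chance roughly 1 time in r.
    """
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    print(round(full_match_ratio(20), 1))                # twenty-mer, whole genome
    print(round(full_match_ratio(17), 1))                # seventeen-mer, whole genome
    print(round(full_match_ratio(15, EXPRESSED_BP), 1))  # fifteen-mer, expressed sequences
```

Under these assumptions the ratios come out to roughly 367, 5.7 and 7.2, which agree in order of
magnitude with the rounded figures of 300 and one in five given above.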

Similarly, when using sequence information for detecting a single mismatch, a segment can 
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome 
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an 
eighteen mer with a single mismatch can be detected in an array for expression studies is 
approximately one in five. The probability that a twenty-mer with a single mismatch can be 
detected in a human genome is approximately one in five. 
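
As a hedged illustration of the single-mismatch calculation described above, the following Python
sketch multiplies the full-match probability (1/4^k) by the number of single-mismatch variants
(3 x k) and compares the result with the size of the target. The three billion base pair genome
and the 5% expressed fraction are the figures used earlier in this section; the function and
constant names are assumptions for this example.

```python
GENOME_BP = 3_000_000_000          # base pairs in one set of human chromosomes
EXPRESSED_BP = GENOME_BP * 0.05    # expressed sequences, taken as 5% of the genome


def single_mismatch_probability(k):
    """Probability that a random k-mer equals a given k-mer with one mismatch.

    Full-match probability (1 / 4**k) times the 3 * k ways of changing
    exactly one position to one of the three other bases.
    """
    return (1 / 4 ** k) * (3 * k)


def single_mismatch_ratio(k, target_bp=GENOME_BP):
    """A ratio of r means a chance single-mismatch hit is expected about 1 time in r."""
    return 1 / (single_mismatch_probability(k) * target_bp)


if __name__ == "__main__":
    print(round(single_mismatch_ratio(25), 1))                # twenty-five mer, genome
    print(round(single_mismatch_ratio(20), 1))                # twenty-mer, genome
    print(round(single_mismatch_ratio(18, EXPRESSED_BP), 1))  # eighteen-mer, expressed
```

With these inputs the twenty-mer and eighteen-mer ratios are about 6 and 8.5, of the same order as
the "approximately one in five" figures stated above.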

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
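
The definition above can be illustrated with a minimal Python sketch that scans the three forward
reading frames of a DNA sequence and reports the longest run of triplets containing no
termination codon. The sketch assumes the DNA alphabet (T rather than U) and the standard stop
codons TAA, TAG and TGA; it does not require an initiating ATG, since the definition above does
not mention one. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons, DNA alphabet


def stop_free_runs(seq, frame):
    """Yield (start, end) indices of maximal stop-free triplet runs in one forward frame."""
    seq = seq.upper()
    run_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > run_start:
                yield run_start, pos
            run_start = pos + 3  # resume after the stop codon
        pos += 3
    if pos > run_start:
        yield run_start, pos


def longest_orf(seq):
    """Longest stop-free triplet run over the three forward frames."""
    best = ""
    for frame in range(3):
        for start, end in stop_free_runs(seq, frame):
            if end - start > len(best):
                best = seq[start:end].upper()
    return best


if __name__ == "__main__":
    example = "CCATGGCTAAATGACC"  # made-up sequence for demonstration
    for frame in range(3):
        print(frame, list(stop_free_runs(example, frame)))
    print(longest_orf(example))
```

The reverse strand could be handled by running the same scan on the reverse complement of the
input.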

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino 
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, 
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
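
As an illustration of a "silent" codon change of the kind described above, the following Python
sketch translates two coding sequences with the standard genetic code and confirms that they
encode the same peptide even though one of them contains a new restriction site. The example
sequence is made up for demonstration; the change from GGT to GGA (both glycine codons)
creates the BamHI recognition sequence GGATCC. The function names are assumptions for this
example.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of a DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original, variant):
    """True if both sequences have the same length and encode the same residues."""
    return len(original) == len(variant) and translate(original) == translate(variant)


if __name__ == "__main__":
    original = "ATGGGTTCCTAA"  # Met-Gly-Ser-stop (illustrative only)
    variant = "ATGGGATCCTAA"   # GGT -> GGA: still glycine, now contains GGATCC
    print(translate(original), translate(variant))
    print(is_silent(original, variant), "GGATCC" in variant)
```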

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
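
The four residue groupings listed above lend themselves to a simple lookup. The Python sketch
below, given for illustration only, encodes those groupings with standard one-letter amino acid
codes and labels a substitution "conservative" when both residues fall in the same group. Treating
group membership as the sole criterion is a simplifying assumption for this example, as are the
function and variable names.

```python
# Groupings as listed in the text, written with standard one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}
CLASS_OF = {aa: name for name, members in RESIDUE_CLASSES.items() for aa in members}


def is_conservative(original, replacement):
    """True if the two different residues belong to the same group."""
    original, replacement = original.upper(), replacement.upper()
    return original != replacement and CLASS_OF[original] == CLASS_OF[replacement]


def classify_substitutions(reference, variant):
    """List (position, from, to, label) for each differing position (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    changes = []
    for pos, (a, b) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if a != b:
            label = "conservative" if is_conservative(a, b) else "non-conservative"
            changes.append((pos, a, b, label))
    return changes


if __name__ == "__main__":
    # Made-up peptides: L->I stays nonpolar, K->D switches basic to acidic.
    print(classify_substitutions("MLKS", "MIDS"))
```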

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denotes that the indicated 
nucleic acid or polypeptide is present in the substantial absence of other biological 
10 macromolecules, e.g. , polynucleotides, proteins, and the like. In one embodiment, the 

polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 
1 5 The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 

at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is 
expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted 
wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are
expressed. "Secreted" proteins also include without limitation proteins that are transported
across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to
include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney,
P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells
(e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); in a further variation of this
embodiment, by no more than 20% (80% sequence identity); in a further variation of this
embodiment, by no more than 10% (90% sequence identity); and in a further variation of this
embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g.,
mutant, amino acid sequences according to the invention preferably have at least 80% sequence
identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more 
preferably at least 90% sequence identity, more preferably at least 95% identity, more preferably 
at least 98% identity, and most preferably at least 99% identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% sequence identity, more preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, more preferably at least about
95% identity, more preferably at least about 98% sequence identity, and most preferably at least
about 99% sequence identity. For the purposes of the present invention, sequences having
substantially equivalent biological activity and substantially equivalent expression characteristics 
are considered substantially equivalent. For the purposes of determining equivalence, truncation 
of the mature sequence (e.g., via a mutation which creates a spurious stop codon) should be
disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J.
(1990) Methods Enzymol. 183:626-645). Identity between sequences can also be determined by
other methods known in the art, e.g., by varying hybridization conditions.
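
The arithmetic in the definition above (substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be illustrated with a minimal sketch. The sketch is an assumption-laden illustration only: it counts changes with a plain Levenshtein edit distance rather than the Jotun Hein method cited above, and the function names and the default threshold parameter are hypothetical.

```python
def edit_distance(reference: str, subject: str) -> int:
    """Minimum number of single-residue substitutions, additions, and
    deletions that convert `reference` into `subject` (Levenshtein)."""
    prev = list(range(len(subject) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        curr = [i]
        for j, sub_residue in enumerate(subject, start=1):
            cost = 0 if ref_residue == sub_residue else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # addition
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fraction_varied(reference: str, subject: str) -> float:
    """Changes divided by the length of the subject (the candidate
    substantially equivalent sequence), as in the definition above."""
    if not subject:
        raise ValueError("subject sequence must not be empty")
    return edit_distance(reference, subject) / len(subject)


def within_variation_limit(reference: str, subject: str,
                           max_variation: float = 0.35) -> bool:
    """True when the subject varies from the reference by no more than
    `max_variation` (0.35 corresponds to about 65% sequence identity)."""
    return fraction_varied(reference, subject) <= max_variation
```

This checks only the sequence-variation criterion; the functional criterion (no adverse functional dissimilarity) is not something a sequence comparison can establish.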

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO:1-1350; a polynucleotide encoding any one of the peptide
sequences of SEQ ID NO:1351-2700; and a polynucleotide comprising the nucleotide sequence
encoding the mature protein coding sequence of the polypeptides of any one of SEQ ID
NO:1351-2700. The polynucleotides of the present invention also include, but are not limited to,
a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide sequences of SEQ ID NO:1-1350; (b) nucleotide sequences encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species
homolog of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of the polypeptides of SEQ ID NO:1351-2700.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed 
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1350 can be obtained by screening
appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of
the polynucleotides of SEQ ID NO:1-1350 or a portion thereof as a probe. Alternatively, the
polynucleotides of SEQ ID NO:1-1350 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The invention also provides polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 
88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at
least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO:1-1350, or complements thereof, which fragment is greater than about 5
nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most
preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that
are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the invention
are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1350, a
representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95%
identical, to SEQ ID NO:1-1350 with a sequence from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules 
coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, 
in the coding region of an ORF, substitution of one codon for another codon that encodes the same 
amino acid is expressly contemplated. 
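
As a hedged illustration of the codon substitution just described, the following sketch translates two coding sequences with the standard genetic code and checks whether they encode the same amino acid sequence. The function names and the example sequences are hypothetical and are not taken from the Sequence Listing; the sketch assumes an uppercase DNA coding strand whose reading frame starts at the first nucleotide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
                "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR"
                "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when two ORFs differ, if at all, only by synonymous codons."""
    return translate(orf_a) == translate(orf_b)
```

For example, the hypothetical sequences ATGGCTTTA and ATGGCCCTG differ at two codons yet both translate to Met-Ala-Leu, so they would be treated as coding for the same amino acid sequence.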

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1350, can be obtained by searching a database using an algorithm or a
program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool, is used to
search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and Altschul,
S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against
Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 
In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. 
When small amounts of template DNA are used as starting material, primers that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 
gives a polynucleotide encoding the desired amino acid variant. 
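
The layout of a mutagenic oligonucleotide as described above (the changed codon flanked on both sides by unchanged template nucleotides) can be sketched as follows. This is a minimal illustration under stated assumptions, not a primer-design procedure: the function name, the default flank length, and the example template are hypothetical, the template is assumed to be a coding strand whose reading frame starts at its first nucleotide, and no melting-temperature or duplex-stability calculation is attempted.

```python
def mutagenic_oligo(template: str, codon_index: int,
                    new_codon: str, flank: int = 12) -> str:
    """Return an oligonucleotide carrying `new_codon` in place of the codon
    at zero-based `codon_index`, with `flank` unchanged template nucleotides
    on each side of the changed codon."""
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = 3 * codon_index
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the target codon")
    upstream = template[start - flank:start]   # unchanged 5' flank
    downstream = template[end:end + flank]     # unchanged 3' flank
    return upstream + new_codon + downstream
```

With a hypothetical template in which codon 8 is TGC (cysteine), calling `mutagenic_oligo(template, 8, "TCC", flank=9)` would return a 21-nucleotide oligonucleotide that substitutes a serine codon while matching the template on either side of the change.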

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic 
code, other DNA sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1350, or functional 
equivalents thereof, may be used to generate recombinant DNA molecules that direct the 
expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also 
included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1350 or a fragment thereof or any other 
polynucleotides of the invention. In one embodiment, the recombinant constructs of the present
invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having
any of the nucleotide sequences of SEQ ID NO:1-1350 or a fragment thereof is inserted, in a
forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present
invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of example.
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a,
pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL
(Pharmacia).
(Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in 
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated 
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art. 
Generally, recombinant expression vectors will include origins of replication and selectable 
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli 
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant product. 
Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO:1-1350, or fragments, analogs or derivatives thereof. An "antisense" 
nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid 
encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA 
molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid 
molecules are provided that comprise a sequence complementary to at least about 10, 25, 50, 
100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid 
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID
NO:1351-2700 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID
NO:1-1350 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ
ID NO:1-1350), antisense nucleic acids of the invention can be designed according to the rules
of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be 
complementary to the entire coding region of an mRNA, but more preferably is an oligonucleotide
that is antisense to only a portion of the coding or noncoding region of an mRNA. For example,
the antisense oligonucleotide can be complementary to the region surrounding the translation
start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25,
30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be 
constructed using chemical synthesis or enzymatic ligation reactions using procedures known in 
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 
chemically synthesized using naturally occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical stability 
of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate 
derivatives and acridine substituted nucleotides can be used. 
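
Designing an antisense oligonucleotide by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of a window of the coding strand. The following is a minimal sketch assuming an unmodified DNA oligonucleotide and a known start codon position; the function names, parameters, and example sequence are hypothetical, and Hoogsteen pairing and modified nucleotides are outside its scope.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of an uppercase DNA sequence."""
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, and T")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start_codon_index: int,
                    upstream: int, downstream: int) -> str:
    """Antisense DNA oligonucleotide (written 5' to 3') complementary to the
    region surrounding the translation start site: `upstream` nucleotides
    before the start codon, the start codon itself, and `downstream`
    nucleotides after it."""
    window_start = start_codon_index - upstream
    window_end = start_codon_index + 3 + downstream
    if window_start < 0 or window_end > len(coding_strand):
        raise ValueError("requested window extends beyond the sequence")
    return reverse_complement(coding_strand[window_start:window_end])
```

For the hypothetical coding strand GGCACCATGGCTTTAGGA with its start codon at index 6, a window of 3 upstream and 6 downstream nucleotides covers ACCATGGCTTTA, and the resulting 12-nucleotide antisense oligonucleotide is TAAAGCCATGGT.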

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID
NO:1-1350). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which
the nucleotide sequence of the active site is complementary to the nucleotide sequence to be
cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et
al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic RNA
having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al.
(1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N. Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.


4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1351-2700 or an
amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1350 or the
corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides preferably with biological or immunological activity that are encoded by: (a) a
polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO:1-1350 or (b)
polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID
NO:1351-2700 or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a)
or (b) under stringent hybridization conditions. The invention also provides biologically active
or immunologically active variants of any of the amino acid sequences set forth as SEQ ID
NO:1351-2700 or the corresponding full length or mature protein; and "substantial equivalents"
thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%,
at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at
least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about
99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants
may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ
ID NO:1351-2700.
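
For illustration only, the percent amino acid identity thresholds recited above can be checked with a short Python sketch. It assumes the two sequences have already been aligned by a separate program (gaps written as "-"), and it assumes one particular denominator, namely all aligned columns in which at least one sequence has a residue; alignment programs differ on this convention. The function names and example sequences are hypothetical.

```python
# Illustrative sketch only: assumes pre-aligned input and one possible
# denominator convention; names and sequences are hypothetical.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap.
    Columns in which both sequences carry a gap are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("MKTA", "MKTG")` evaluates to 75.0, which meets an "at least about 65%" threshold but not an "at least about 80%" threshold. Identity alone does not establish that a variant retains biological activity.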

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
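
As a non-limiting illustration of the degenerate variant concept, the following Python sketch translates two ORFs with the standard genetic code and reports whether they differ in nucleotide sequence while encoding an identical polypeptide. The helper names and the two short example ORFs are hypothetical; the codon table is built from the standard code written in T, C, A, G order.

```python
# Illustrative sketch only: helper names and example ORFs are hypothetical.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA ORF in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


orf_1 = "ATGGCTTCGAAGTAA"  # made-up ORF
orf_2 = "ATGGCCAGCAAATGA"  # different codons, same encoded residues
```

Both example ORFs translate to the same four-residue peptide, so `is_degenerate_variant(orf_1, orf_2)` is true, whereas a change that alters an encoded residue is not a degenerate variant in this sense.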

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:1351-2700.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
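
The substitution step of the alanine-scanning method described above can be sketched as follows, purely by way of example. The function name, the `window` parameter (1 for single residues, larger values for strings of residues) and the example peptide are hypothetical; the sketch only enumerates the variant sequences, and testing each variant for biological activity remains an experimental step.

```python
# Illustrative sketch only: enumerates alanine-substituted variants of a
# protein sequence; names and the example peptide are hypothetical.
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """Return (1-based start position, variant sequence) pairs in which
    `window` consecutive residues are replaced with alanine."""
    protein = protein.upper()
    variants = []
    for start in range(0, len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue  # segment is already all alanine; nothing to substitute
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


single_scan = alanine_scan("MKA")            # single-residue substitutions
string_scan = alanine_scan("MKAC", window=2)  # two-residue strings
```

With the three-residue example, `single_scan` contains two variants, because the third residue is already alanine and is skipped.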

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g. , targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY AND SIMILARITY 

Preferred methods for determining identity and/or similarity are designed to give the largest 
match between the sequences tested. Methods to determine identity and similarity are codified in 
computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
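
By way of illustration only, the notion of percent identity over a global alignment can be sketched as follows. This is a minimal Needleman-Wunsch sketch with a linear gap penalty; the function names and the scoring values (match, mismatch, gap) are assumptions chosen for the example and are not the parameters or implementation of GAP, BLAST or any other program cited above, which use substitution matrices and more elaborate gap models.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not those of any
    published program.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Note that the reported percentage depends on the denominator chosen (here, the full alignment length including gapped columns); other conventions, such as the length of the shorter sequence, give different values for the same alignment.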

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 
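
As a rough illustration of what "fused in-frame" means at the nucleotide level, the following sketch checks that joining two coding sequences preserves the reading frame and introduces no premature stop codon. The function name, the optional linker argument and the toy sequences are hypothetical and serve only to illustrate the concept; they are not sequences of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(n_part, c_part, linker=""):
    """Return True if n_part + linker + c_part forms one continuous reading frame.

    n_part : coding sequence of the N-terminal partner, without its stop codon
    c_part : coding sequence of the C-terminal partner (may end in a stop codon)
    linker : optional linker coding sequence (hypothetical parameter)
    """
    # Each segment must be a whole number of codons, otherwise the frame shifts.
    for part in (n_part, linker, c_part):
        if len(part) % 3 != 0:
            return False
    fused = (n_part + linker + c_part).upper()
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A stop codon anywhere before the last codon would truncate the fusion.
    return all(codon not in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    print(is_in_frame_fusion("ATGGCC", "GGATCCTAA"))  # frame preserved
    print(is_in_frame_fusion("ATGGC", "GGATCCTAA"))   # frame shifted by one base
```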

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for treating 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
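
The overlap-based PCR approach described above, in which two fragments share a complementary overhang and are annealed and reamplified into one chimeric sequence, can be modeled in simplified form on top-strand sequences. The sketch below is an assumption-laden illustration: the function names and the toy fragments are hypothetical, and real primer design must also consider melting temperature, secondary structure and reading frame.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def join_by_overlap(upstream, downstream, overlap):
    """Join two top-strand fragments that share `overlap` bases at their junction.

    The 3' end of `upstream` must equal the 5' end of `downstream`; in the
    laboratory this shared region is what lets the two strands anneal.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    if overlap <= 0 or upstream[-overlap:] != downstream[:overlap]:
        raise ValueError("fragments do not share the expected overlap")
    # Keep the shared region once and append the rest of the downstream fragment.
    return upstream + downstream[overlap:]


if __name__ == "__main__":
    chimeric = join_by_overlap("ATGGCCGGATCC", "GGATCCAAATAA", 6)
    print(chimeric)
    print(reverse_complement(chimeric))
```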

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of the polypeptides as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic 
fibroblast growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al. Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on a semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating
utility, for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid
cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and
proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in place
of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the
repair of congenital, trauma-induced, or oncologic resection-induced craniofacial defects, and
also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has
application in the healing of tendon or ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament
tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and
in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation
induced by a composition of the present invention contributes to the repair of congenital,
trauma-induced, or other tendon or ligament defects, and is also useful in cosmetic plastic
surgery for attachment or repair of tendons or ligaments. The compositions of the present
invention may provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is
well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease,
amyotrophic lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated
in accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and 
encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., 
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed
by dendritic cells that activate naive T-cells) include, without limitation, those described in:
Query et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of FSH. Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
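
By way of illustration only, the membrane-migration readout described above is often summarized as a chemotactic index, i.e., the ratio of cells that migrate toward the test protein to cells that migrate toward medium alone. The Python sketch below assumes that definition; the function name, the replicate counts, and the use of a simple ratio of means are hypothetical and are not taken from this specification or from the cited references.

```python
from statistics import mean


def chemotactic_index(test_counts, control_counts):
    """Return mean migrated cells in test wells divided by mean in control wells.

    Both arguments are replicate counts of cells that crossed the membrane.
    The ratio-of-means definition is an assumption made for this sketch.
    """
    if not test_counts or not control_counts:
        raise ValueError("each condition needs at least one replicate count")
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control migration must be positive to form a ratio")
    return mean(test_counts) / control_mean


# Hypothetical replicate counts (illustrative values, not measured data).
test_wells = [120, 135, 128]     # lower chamber contains the test protein
control_wells = [40, 38, 42]     # lower chamber contains medium alone
index = chemotactic_index(test_wells, control_wells)
print(f"chemotactic index: {index:.2f}")
```

An index above 1 would suggest that the test protein increases migration, although a ratio of this kind does not by itself distinguish directed movement (chemotaxis) from increased random movement (chemokinesis).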

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis, thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), 
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g., 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cells lines are available, 
e.g. from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses). Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of 
complexes between polypeptides of the invention or fragments and the agent being tested or 
examine the diminution in complex formation between the novel polypeptides and an 
appropriate cell line, which are well known in the art. 
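
The diminution in complex formation measured in a competitive binding assay can be expressed as a percent inhibition. The following Python sketch is illustrative only and is not part of the disclosed method: it assumes a hypothetical readout in which complex formation is reported as a numeric signal, and the function names, the background correction and the 50% hit threshold are assumptions made for the example.

```python
# Illustrative sketch only: signal values, names and the threshold are
# hypothetical and are not taken from the specification.

def percent_inhibition(signal_with_compound, signal_without_compound, background=0.0):
    """Percent reduction in complex formation caused by a test compound."""
    specific_control = signal_without_compound - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_compound - background
    return 100.0 * (1.0 - specific_test / specific_control)


def rank_hits(readings, signal_without_compound, background=0.0, threshold=50.0):
    """Return (compound, percent inhibition) pairs at or above the threshold,
    strongest inhibitor first."""
    scored = [
        (name, percent_inhibition(signal, signal_without_compound, background))
        for name, signal in readings.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical readings for three test compounds.
    readings = {"compound_A": 300.0, "compound_B": 900.0, "compound_C": 150.0}
    for name, inhibition in rank_hits(readings, 1100.0, background=100.0):
        print(f"{name}: {inhibition:.1f}% inhibition")
```

A compound that leaves the signal unchanged scores 0%, and one that reduces it to background scores 100%; the threshold for calling a "hit" is a design choice that would depend on the assay.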

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 252:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
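
The comparison of two otherwise identical cell populations described above amounts to flagging ligands that elicit a response only where the receptor is expressed. The Python sketch below shows one way such a comparison might be tabulated; the response values, the function name and the two-fold cutoff are hypothetical assumptions for illustration and do not come from the specification.

```python
# Illustrative sketch only: the data layout and the fold-change cutoff are
# assumptions made for this example.

def candidate_ligands(receptor_positive, receptor_negative, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells is at least
    min_fold times the response in the otherwise identical control cells."""
    candidates = []
    for ligand, response in receptor_positive.items():
        baseline = receptor_negative.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        if response / baseline >= min_fold:
            candidates.append(ligand)
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical responses (arbitrary units) to three exogenous ligands.
    expressing = {"ligand_1": 10.0, "ligand_2": 3.0, "ligand_3": 8.0}
    control = {"ligand_1": 2.0, "ligand_2": 3.0, "ligand_3": 0.0}
    print(candidate_ligands(expressing, control))
```

Ligands without a usable control measurement are skipped rather than reported, since the comparison is only meaningful when both populations were assayed.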

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent, such as for 
acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
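
The principle behind restriction fragment length polymorphism analysis, in which a polymorphism creates or abolishes a restriction site and so changes the fragment pattern, can be sketched in a few lines of Python. The sketch below is illustrative only: the two short allele sequences are invented for the example and are not sequences of the invention, the function names are hypothetical, and the model considers a linear, single-stranded sequence read 5' to 3'. The EcoRI recognition site GAATTC, cut after the first G, is used as a familiar example enzyme.

```python
# Illustrative sketch only: sequences and function names are hypothetical.

def fragment_lengths(sequence, site, cut_offset):
    """Lengths of the fragments produced by cutting a linear sequence at every
    occurrence of a recognition site; cut_offset is the number of bases into
    the site at which the cut falls."""
    cut_positions = []
    start = 0
    while True:
        index = sequence.find(site, start)
        if index == -1:
            break
        cut_positions.append(index + cut_offset)
        start = index + 1
    bounds = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def alleles_distinguishable(allele_a, allele_b, site, cut_offset):
    """True if digestion yields different fragment patterns for the two alleles."""
    pattern_a = sorted(fragment_lengths(allele_a, site, cut_offset))
    pattern_b = sorted(fragment_lengths(allele_b, site, cut_offset))
    return pattern_a != pattern_b


if __name__ == "__main__":
    reference = "ACGTGAATTCTTGA"  # contains one GAATTC site
    variant = "ACGTGAGTTCTTGA"    # single base change abolishes the site
    print(fragment_lengths(reference, "GAATTC", 1))
    print(fragment_lengths(variant, "GAATTC", 1))
    print(alleles_distinguishable(reference, variant, "GAATTC", 1))
```

In practice the fragments would be resolved by size, so only a polymorphism that alters the set of fragment lengths is detectable this way; that is the condition the second function checks.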

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
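
The treatment and scoring timetable described above can be laid out programmatically, together with a simple comparison of treated and control scores. The Python sketch below is an illustration under stated assumptions: day 0 is taken to be the day of the Mycobacterium/CFA injection, the per-animal scores are supplied by the user, and the function names and the percent-reduction summary are hypothetical choices rather than the analysis used in the cited protocol.

```python
# Illustrative sketch only: names and the summary statistic are assumptions.
from statistics import mean

# Days after injection on which an overall arthritis score is obtained.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def treatment_days(last_day=24, interval=2):
    """Day 0 (immediately after injection) and every other day until last_day."""
    return list(range(0, last_day + 1, interval))


def mean_score_reduction(control_scores, treated_scores):
    """Percent reduction in mean arthritis score (treated versus PBS control)
    for each scoring day. Both arguments map day -> list of per-animal scores."""
    reductions = {}
    for day in SCORING_DAYS:
        control_mean = mean(control_scores[day])
        if control_mean == 0:
            raise ValueError(f"control mean score is zero on day {day}")
        treated_mean = mean(treated_scores[day])
        reductions[day] = 100.0 * (control_mean - treated_mean) / control_mean
    return reductions


if __name__ == "__main__":
    print(treatment_days())
```

A positive value on a given day indicates a lower mean score in the treated group than in the PBS-only control group on that day.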

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
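
The weight-based ranges stated above (about 0.01 µg/kg to 100 mg/kg, preferably about 0.1 µg/kg to 10 mg/kg) translate into absolute per-dose amounts by simple multiplication. The Python sketch below shows only that arithmetic; the constant and function names are hypothetical, and, as stated above, the actual dosage would be determined by the prescribing physician.

```python
# Illustrative sketch only: unit conversion for the ranges quoted in the text.
# Names are hypothetical; this is arithmetic, not a dosing recommendation.

# Ranges expressed in micrograms per kilogram of body weight.
TYPICAL_RANGE_UG_PER_KG = (0.01, 100_000.0)    # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10_000.0)    # 0.1 ug/kg to 10 mg/kg


def dose_range_mg(body_weight_kg, range_ug_per_kg=PREFERRED_RANGE_UG_PER_KG):
    """Return the (low, high) absolute amount per dose, in milligrams,
    corresponding to a per-kilogram range for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_ug_per_kg, high_ug_per_kg = range_ug_per_kg
    low_mg = low_ug_per_kg * body_weight_kg / 1000.0
    high_mg = high_ug_per_kg * body_weight_kg / 1000.0
    return low_mg, high_mg


if __name__ == "__main__":
    low, high = dose_range_mg(70.0)
    print(f"Preferred range for a 70 kg patient: {low:.3f} mg to {high:.0f} mg")
```

For a 70 kg patient the preferred range spans roughly 0.007 mg to 700 mg per dose, which illustrates how broad the stated range is.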

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

20 A protein or other composition of the present invention (from whatever source derived, 

including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often
in a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration
chosen. When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered orally, protein or other active ingredient of the present 
invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered 
in tablet form, the pharmaceutical composition of the invention may additionally contain a solid 
carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to
95% protein or other active ingredient of the present invention, and preferably from about 25 to 
90% protein or other active ingredient of the present invention. When administered in liquid 
form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, 
mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous 
solutions, preferably in physiologically compatible buffers such as Hanks's solution, Ringer's 
solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate
to the barrier to be permeated are used in the formulation. Such penetrants are generally known
in the art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use
in an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed. 
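
The composition of the diluted VPD:5W co-solvent system described above can be worked out
as sketched below. The sketch assumes that "diluted 1:1" means mixing equal volumes and that
the volumes are additive, neither of which is stated in the text; the names `VPD`, `D5W`, and
`mix_equal_volumes` are introduced here for illustration only.

```python
# Stated w/v percentages of the VPD concentrate (balance: absolute ethanol).
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# 5% dextrose in water.
D5W = {"dextrose": 5.0}


def mix_equal_volumes(a, b):
    """Return w/v percentages after mixing equal volumes of solutions a and b.

    Assumes additive volumes, so every concentration is simply halved.
    """
    names = sorted(set(a) | set(b))
    return {name: (a.get(name, 0.0) + b.get(name, 0.0)) / 2.0 for name in names}


vpd_5w = mix_equal_volumes(VPD, D5W)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:g}% w/v")
```

Under these assumptions the diluted system contains 1.5% benzyl alcohol, 4% polysorbate 80,
32.5% polyethylene glycol 300, and 2.5% dextrose (all w/v).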

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate, and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
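
As a purely illustrative sketch, the amount of sequestering agent corresponding to a given
weight percentage of the total formulation can be computed as follows. The function name
`sequestering_agent_mass_g` and the 5 g batch size are hypothetical; only the 0.5-20 wt %
range and the preferred 1-10 wt % range are taken from the text above.

```python
def sequestering_agent_mass_g(total_formulation_g, wt_percent):
    """Mass of sequestering agent (g) at wt_percent of the total formulation weight.

    Rejects percentages outside the 0.5-20 wt % range stated in the text.
    """
    if total_formulation_g <= 0:
        raise ValueError("total_formulation_g must be positive")
    if not 0.5 <= wt_percent <= 20.0:
        raise ValueError("wt_percent must lie within 0.5-20 wt %")
    return total_formulation_g * wt_percent / 100.0


# Hypothetical 5 g formulation at the bounds of the preferred 1-10 wt % range.
low_g = sequestering_agent_mass_g(5.0, 1.0)
high_g = sequestering_agent_mass_g(5.0, 10.0)
print(f"{low_g:.2f} g to {high_g:.2f} g of sequestering agent")
```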

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution 
and with inclusion of other proteins in the pharmaceutical composition. For example, the 
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic assessment of
tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and
tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 



4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1 p.1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
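
The therapeutic index defined above, i.e., the ratio between the LD50 and the ED50, can be
illustrated with the minimal sketch below. The function name `therapeutic_index` and the
numerical values are hypothetical and are not data from the disclosure; both doses are assumed
to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50 / ED50 for doses given in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("ld50 and ed50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, chosen only to show the arithmetic.
ti = therapeutic_index(ld50=200.0, ed50=5.0)
print(f"Therapeutic index: {ti:g}")
```

A larger ratio indicates a wider margin between the therapeutically effective dose and the lethal
dose, which is why compounds with high therapeutic indices are preferred.
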

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
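
One way to estimate the fraction of a dosing interval during which plasma levels remain above
the MEC is sketched below. The sketch assumes a simple model that is not part of the
disclosure: the plasma concentration starts at a peak value at the beginning of the interval and
declines by first-order elimination with a fixed half-life. The function name
`fraction_above_mec`, its parameters, and the example values are hypothetical.

```python
import math


def fraction_above_mec(c_peak, mec, half_life_h, interval_h):
    """Fraction of the dosing interval with plasma concentration above the MEC.

    Assumes C(t) = c_peak * exp(-k * t) with k = ln(2) / half_life_h.
    """
    if min(c_peak, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c_peak <= mec:
        return 0.0  # never above the MEC
    k = math.log(2) / half_life_h
    t_above = math.log(c_peak / mec) / k  # time at which C(t) falls to the MEC
    return min(t_above / interval_h, 1.0)


# Hypothetical example: peak 8 units, MEC 1 unit, 4 h half-life, dosing every 24 h.
frac = fraction_above_mec(c_peak=8.0, mec=1.0, half_life_h=4.0, interval_h=24.0)
print(f"Fraction of interval above MEC: {frac:.2f}")
```

In this example the concentration falls to the MEC after three half-lives (12 hours), so levels
stay above the MEC for about half of the 24 hour interval. A shorter interval or a higher peak
would raise the fraction.
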

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein (for example, the amino acid sequence shown in SEQ ID NO: 1351),
and encompasses an epitope thereof such that an antibody raised against the peptide forms a
specific immune complex with the full length protein or with any fragment that contains the
epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least
15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues.
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are
located on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human protein sequence will
indicate which regions of the protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982,
J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
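By way of illustration only, a sliding-window hydropathy profile of the kind referred to above can be sketched as follows. The per-residue values are the published Kyte-Doolittle hydropathy scale (positive values are hydrophobic); the window length of 7, the function names and the example sequence are assumptions made for this sketch, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of each full-length window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    sequence = sequence.upper()
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence, not a sequence of the invention.
start, peptide, score = most_hydrophilic_window("MLLAVLAGKDEERKRSDLIVAL", window=7)
print(start, peptide, round(score, 2))
```

Windows with the most negative mean score are the most hydrophilic and are therefore candidate surface regions; whether such a region is actually exposed or immunogenic would still have to be confirmed experimentally.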

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).



5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a
fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human
origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources
are desired. The lymphocytes are then fused with an immortalized cell line using a suitable
fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal
Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell
lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can
be cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Rodbard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
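For orientation, the classical linear Scatchard transform for a single class of binding sites can be sketched as below. This is a plain least-squares fit of bound/free against bound, in which the slope equals -1/Kd and the x-intercept equals Bmax; it is offered as an assumption-laden illustration and is not represented to be the specific computational procedure of the cited reference. The function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free = Bmax/Kd - bound/Kd by ordinary least squares,
    assuming one class of independent, non-cooperative binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope; data do not fit a one-site model")
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units, for illustration only).
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
print(round(kd, 3), round(bmax, 3))  # 2.0 10.0
```

A smaller Kd corresponds to a higher binding affinity. With real, noisy binding data the linear transform distorts the error structure, so a nonlinear fit of the untransformed data is often preferred.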

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
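The reasoning behind subcloning by limiting dilution can be illustrated with a short calculation. The sketch assumes that the number of cells deposited in each well follows a Poisson distribution with a chosen mean; that statistical model, the function name and the seeding densities shown are assumptions for illustration and are not taken from this specification.

```python
import math


def clonal_fraction(mean_cells_per_well: float) -> float:
    """Fraction of cell-containing wells that received exactly one cell.

    Assumes Poisson-distributed seeding: P(0) = exp(-m), P(1) = m * exp(-m).
    """
    m = mean_cells_per_well
    if m <= 0:
        raise ValueError("mean cells per well must be positive")
    p_zero = math.exp(-m)
    p_one = m * p_zero
    return p_one / (1.0 - p_zero)


# Compare hypothetical seeding densities.
for m in (0.3, 1.0, 3.0):
    print(m, round(clonal_fraction(m), 3))
```

Under this model, lowering the mean number of cells per well raises the proportion of growing wells that derive from a single cell, at the cost of leaving more wells empty.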

The monoclonal antibodies secreted by the subclones can be isolated or purified from the
culture medium or ascites fluid by conventional immunoglobulin purification procedures such
as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis,
dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 222:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B
cells which secrete fully human immunoglobulins. The antibodies can be obtained directly from
the animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
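The count of ten molecules follows from simple combinatorics, which the following sketch enumerates. It assumes that each antibody consists of two heavy-chain/light-chain arms, that either heavy chain can pair with either light chain, and that the two arms of a molecule are interchangeable; the chain labels H1, H2, L1 and L2 are hypothetical names, with H1/L1 and H2/L2 taken as the cognate pairs.

```python
from itertools import combinations_with_replacement, product


def quadroma_products(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate distinct two-armed molecules from random chain assortment."""
    # Each arm is one heavy chain paired with one light chain: 4 possible arms.
    arms = list(product(heavy, light))
    # A molecule is an unordered pair of arms (the same arm may occur twice).
    return list(combinations_with_replacement(arms, 2))


molecules = quadroma_products()
# The desired bispecific molecule carries one cognate arm of each specificity.
correct = [m for m in molecules if set(m) == {("H1", "L1"), ("H2", "L2")}]
print(len(molecules), len(correct))  # 10 1
```

Four possible arms taken two at a time with repetition give ten combinations, only one of which pairs both cognate arms, which is why affinity purification steps are usually needed to recover the correct molecule.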

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Holliger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus
cellular defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies
can also be used to direct cytotoxic agents to cells which express a particular antigen. These
antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a
radionuclide chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody
of interest binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolaca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
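
As a minimal sketch of the two storage options named above (an ASCII text file and a database application), the following Python fragment formats a sequence as a FASTA-style ASCII record and stores it in a relational table. SQLite stands in here for the DB2, Sybase or Oracle systems mentioned in the text; the table layout, the function names and the example sequence are illustrative assumptions and are not taken from the sequence listing.

```python
import sqlite3


def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA-style ASCII record."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def store_sequences(conn: sqlite3.Connection, records) -> None:
    """Record (seq_id, residues) pairs in a hypothetical 'sequences' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sequences VALUES (?, ?)", records)
    conn.commit()


def fetch_sequence(conn: sqlite3.Connection, seq_id: str):
    """Return the stored residues for seq_id, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


# Placeholder sequence for illustration only; not one of SEQ ID NO:1-1350.
EXAMPLE_RECORDS = [("example_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")]
```

The same records can be written to disk with `open(path, "w").write(to_fasta(...))`; which structure is preferable depends, as stated above, on the means chosen to access the stored information.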

By providing any of the nucleotide sequences SEQ ID NO:1-1350 or a representative
fragment thereof, or a nucleotide sequence at least 95% identical to any of the nucleotide
sequences of SEQ ID NO:1-1350, in computer readable form, a skilled artisan can routinely
access the sequence information for a variety of purposes. Computer software is publicly
available which allows a skilled artisan to access sequence information provided in a computer
readable medium. The examples which follow demonstrate how software which implements the
BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp.
Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading
frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein encoding fragments
and may be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
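
To illustrate what identifying an ORF in a stored sequence involves, the following Python sketch scans the three forward reading frames for an ATG start codon followed in frame by a stop codon. It is a simplified stand-in written for this illustration and is not an implementation of BLAST or BLAZE; the function name, the minimum-length parameter and the restriction to the forward strand are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 3):
    """Return (start, end, frame) tuples for ATG-to-stop ORFs.

    Only the forward strand is scanned. 'start' is inclusive, 'end' is
    exclusive and includes the stop codon; 'min_codons' counts the start
    and stop codons as well.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

For the short test string `"CCATGAAATTTTAGGG"` the function reports a single ORF, `ATGAAATTTTAG`, in the third frame. A fuller treatment would also scan the reverse complement.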

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
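
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The Python sketch below assumes, purely for illustration, that residues in the database are independent and equally frequent, so that a specific target of length L matches at any one position with probability a^(-L), where a is the alphabet size (4 for nucleotides, 20 for amino acids). The function name and the database sizes are hypothetical.

```python
def expected_chance_matches(target_length: int,
                            database_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of one target in a database.

    Assumes independent, uniformly distributed residues, which real
    sequences only approximate.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


# A 6-nucleotide target is expected about once per 4096 positions, whereas a
# 30-nucleotide target is not expected by chance even in 10**9 positions.
six_mer = expected_chance_matches(6, 4096 + 5)
thirty_mer = expected_chance_matches(30, 10**9)
```

Under these assumptions each added nucleotide lowers the expected number of chance matches fourfold, which is the sense in which longer target sequences are less likely to occur at random.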

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
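
As one example of searching for a nucleic acid target motif, a candidate hairpin can be located as an inverted repeat: a short stem, a loop, and the reverse complement of the stem. The Python sketch below is a simplified illustration under assumed parameters (a fixed stem length, a loop of 3 to 8 bases, exact base pairing); the function names are hypothetical and no particular search program is implied.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(sequence: str, stem: int = 4,
                  min_loop: int = 3, max_loop: int = 8):
    """Return (start, stem_length, loop_length) for candidate hairpins.

    A candidate is a stem at 'start', a loop of min_loop..max_loop bases,
    and then the exact reverse complement of the stem.
    """
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        partner = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the second stem arm would begin
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == partner:
                hits.append((i, stem, loop))
    return hits
```

For the test string `"GGACTTTTGTCC"` the stem `GGAC` pairs with `GTCC` across a four-base loop.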

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem.
56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into polypeptide. Both techniques have been demonstrated to be effective in model
systems. Information contained in the sequences of the present invention is necessary for the
design of an antisense or triple helix oligonucleotide.
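
The dependence of antisense design on sequence information can be sketched as follows: given an mRNA sequence, a DNA oligonucleotide complementary to a chosen 20 to 40 base window is the reverse complement of that window. The function name, the choice of window and the example mRNA are illustrative assumptions; practical design would also weigh target accessibility and specificity, which this sketch ignores.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to mrna[start:start+length].

    The length is restricted to the 20 to 40 base range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    # Work in the DNA alphabet so the oligonucleotide is written with T, not U.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    return target.translate(DNA_COMPLEMENT)[::-1]


# Placeholder mRNA for illustration only; not a sequence of the invention.
EXAMPLE_MRNA = "AUGGCCAUUGUAAUGGGCCGCUGA"
EXAMPLE_OLIGO = antisense_oligo(EXAMPLE_MRNA, 0, 20)
```

Reading `EXAMPLE_OLIGO` 5' to 3' gives a strand that base-pairs with the first 20 bases of the placeholder mRNA.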

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 



4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1350, or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail,
said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO:1-1350. Because the corresponding gene is only expressed in a limited
number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ
ID NO:1-1350 can be used as an indicator of the presence of RNA of a cell type of such a tissue in
a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patent Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
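
A degenerate pool can be written compactly with the standard IUPAC nucleotide ambiguity codes and expanded into its member sequences. The Python sketch below shows that expansion; the function name and the example probe are illustrative assumptions and do not correspond to any probe of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


# 'ARY' stands for A, then A or G, then C or T: a pool of four sequences.
EXAMPLE_POOL = expand_degenerate("ARY")
```

The pool size is the product of the number of bases allowed at each position, so it grows quickly as degenerate positions are added.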

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal,
Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
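
The quantities in the coupling procedure above can be checked with simple dilution arithmetic. The Python sketch below works through them under one stated assumption: that the 7.5 ng/µl figure is the DNA concentration in water before the 1-methylimidazole stock is added. If the figure instead describes the final mixture, the DNA amount per well is 7.5 ng/µl x 75 µl. The constant and function names are illustrative.

```python
DNA_NG_PER_UL = 7.5      # DNA in water (assumed to be before buffer addition)
MEIM_STOCK_MM = 100.0    # 0.1 M 1-methylimidazole stock
MEIM_FINAL_MM = 10.0     # target 1-methylimidazole concentration
DNA_VOLUME_UL = 75.0     # ssDNA solution dispensed per well
EDC_STOCK_M = 0.2        # EDC stock concentration
EDC_VOLUME_UL = 25.0     # EDC solution added per well


def coupling_mix():
    """Return per-well quantities implied by the stated volumes."""
    # Fraction of the DNA/buffer mixture contributed by the buffer stock.
    meim_fraction = MEIM_FINAL_MM / MEIM_STOCK_MM
    dna_ng_per_ul = DNA_NG_PER_UL * (1 - meim_fraction)
    total_volume_ul = DNA_VOLUME_UL + EDC_VOLUME_UL
    return {
        "dna_ng_per_well": dna_ng_per_ul * DNA_VOLUME_UL,
        "total_volume_ul": total_volume_ul,
        "edc_final_molar": EDC_STOCK_M * EDC_VOLUME_UL / total_volume_ul,
    }
```

Under that assumption each well receives roughly 0.5 µg of DNA in a 100 µl reaction containing 0.05 M EDC; because the EDC is itself dissolved in 10 mM 1-methylimidazole, the buffer concentration is unchanged by its addition.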

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of
these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, Cv,JI, described by Fitzgerald et al (1 992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
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of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
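
The recognition rules described above can be illustrated with a short sketch. It assumes a linear DNA string over A, C, G and T (pUC19 itself is circular), and the names `cviji_cut_sites` and `fragments` are hypothetical helpers introduced only for this illustration. The relaxed (CviJI**) rule is taken to be exactly the three site classes named in the text: PuGCPy, PyGCPy and PuGCPu.

```python
PURINES = set("AG")      # Pu
PYRIMIDINES = set("CT")  # Py


def cviji_cut_sites(seq, relaxed=False):
    """Return 0-based cut positions (index of the C that follows the cut).

    Normal specificity: PuGCPy, cut between G and C.
    Relaxed specificity (assumed model of CviJI**): PuGCPy, PyGCPy, PuGCPu.
    """
    seq = seq.upper()
    sites = []
    for i in range(1, len(seq) - 2):          # i indexes the G of a candidate GC
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        normal = left in PURINES and right in PYRIMIDINES
        star = (
            normal
            or (left in PYRIMIDINES and right in PYRIMIDINES)
            or (left in PURINES and right in PURINES)
        )
        if normal or (relaxed and star):
            sites.append(i + 1)               # blunt cut falls between G and C
    return sites


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut site."""
    bounds = [0] + cviji_cut_sites(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Under this model the relaxed enzyme cuts at more sites than the normal one, which is what makes the resulting fragment distribution approach a random one.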

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5 
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
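
The 64-patient example can be made concrete with a small layout calculation. This is only a sketch under stated assumptions: the 96 pins are taken to sit at the standard 9 mm microtiter pitch (8 rows by 12 columns), each pin is assumed to print its 64 offset dots as an 8 x 8 block at a 1 mm pitch, and the function names are hypothetical. With those assumptions each subarray spans 8 mm and is followed by a 1 mm gap.

```python
def dot_position(patient, pin_row, pin_col,
                 pin_pitch_mm=9.0, dot_pitch_mm=1.0, side=8):
    """Return (x_mm, y_mm) of one dot on the membrane.

    patient          : 0..63, selects the offset inside the subarray
    pin_row, pin_col : position of the pin in the assumed 8 x 12 device
    """
    if not 0 <= patient < side * side:
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < 8 and 0 <= pin_col < 12):
        raise ValueError("pin outside the 8 x 12 device")
    off_row, off_col = divmod(patient, side)   # offset-printing step
    x_mm = pin_col * pin_pitch_mm + off_col * dot_pitch_mm
    y_mm = pin_row * pin_pitch_mm + off_row * dot_pitch_mm
    return x_mm, y_mm


def membrane_layout():
    """Map (patient, pin_row, pin_col) to a dot position for all 64 x 96 dots."""
    return {(p, r, c): dot_position(p, r, c)
            for p in range(64) for r in range(8) for c in range(12)}
```

Under these assumptions the farthest dot lies at (106 mm, 70 mm), so all 6144 dots fit on the 8 x 12 cm membrane mentioned in the text.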

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences 
which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and 
screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were 
clustered into groups of similar or identical sequences. Representative clones were selected for 
sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 



5.2 EXAMPLE 2 

Novel Contigs 

The novel contigs of the invention were assembled from sequences that were obtained from 
a cDNA library by methods described in Example 1 above, and in some cases sequences obtained 
from one or more public databases. The sequences for the resulting nucleic acid contigs are 
designated as SEQ ID NO: 1-1350 and are provided in the attached Sequence Listing. The contigs 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending 
assemblage with BLAST score greater than 300 and percent identity greater than 95%. 
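
The seed-and-extend procedure described above can be sketched as a simple closure computation. This is not the actual assembly software; it is a minimal illustration that assumes a caller-supplied `search` function returning BLASTN-style hits for a query, and it simplifies the method by searching with each newly added member rather than with the growing consensus. The names `Hit`, `passes` and `extend_assemblage` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    seq_id: str       # identifier of the database sequence that was hit
    score: float      # BLAST score
    identity: float   # percent identity


def passes(hit, min_score=300, min_identity=95.0):
    """Inclusion rule from the text: score > 300 and identity > 95%."""
    return hit.score > min_score and hit.identity > min_identity


def extend_assemblage(seed_id, search, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    `search(seq_id)` is an assumed callable returning an iterable of Hit
    objects for the given query sequence.
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                      # stops when nothing extends the set
        query = frontier.pop()
        for hit in search(query):
            if hit.seq_id not in members and passes(hit, min_score, min_identity):
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members
```

A toy `search` backed by a dictionary is enough to exercise the loop; in practice the callable would wrap a database search over the EST collections named in the text.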

Table 3 sets forth the novel predicted polypeptides (including proteins) encoded by the 
novel polynucleotides (SEQ ID NO: 189-282) of the present invention, and their corresponding 
nucleotide locations to each of SEQ ID NO: 189-282. Table 3 also indicates the method by which 
the polypeptide was predicted. Method A refers to a polypeptide obtained by using a software 
program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide 
based on a comparison of the translated novel polynucleotide to known polynucleotides (W.R. 
Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B 
refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate 
sequences (available from Stanford University, Office of Technology Licensing) that predicts the 
polypeptide based on a probabilistic model of gene structure/compositional properties (C. Burge 
and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). Method C refers 
to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel 
polynucleotide and its complementary strand into six possible amino acid sequences (forward and 
reverse frames) and chooses the polypeptide with the longest open reading frame. 
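
Method C, as described, amounts to a six-frame translation followed by selection of the longest open reading frame. The proprietary program itself is not disclosed, so the following is only a sketch of the general technique. It assumes the standard genetic code, treats an open reading frame as the longest stop-free stretch of codons (the text does not say whether a start codon is required), and uses hypothetical function names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def longest_orf(seq):
    """Return (peptide, frame) for the longest stop-free stretch in six frames.

    Frames are labelled '+1'..'+3' and '-1'..'-3'; ties keep the first found.
    """
    seq = seq.upper()
    best = ("", None)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for peptide in translate(s[frame:]).split("*"):
                if len(peptide) > len(best[0]):
                    best = (peptide, f"{strand}{frame + 1}")
    return best
```

Requiring an initiating methionine, or a minimum length, would be straightforward variations on the same loop.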

The nearest neighbor results for SEQ ID NO: 1-1350 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq database October 
12, 2000, update 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the 
closest homologue for SEQ ID NO:1-1350. The nearest neighbor results for SEQ ID NO: 1-1350 
are shown in Table 2 below. 

Tables 1, 2 and 3 follow. Table 1 shows the various tissue sources of SEQ ID NO: 1-1350. 
Table 2 shows the nearest neighbor result for the assembled contig. The nearest neighbor result 
shows the closest homolog with an identifiable function for each assemblage. Table 3 contains the 
start and stop nucleotides for the translated amino acid sequence for which each assemblage 
encodes. Table 3 also provides a correlation between the amino acid sequences set forth in the 
Sequence Listing, the nucleotide sequences set forth in the Sequence Listing and the SEQ ID NO. in 
USSN 09/496,914. 
TABLE 1 



Tissue Origin 


RNA Source 


Hyseq Library Name 


SEQ ID NOS: 


adult brain 


VJLDV/W 


AB3001 


"ill 151 188 215 662-665 877 910927 
976 1233 1319 


adult brain 


GIBCO 


ABD003 


41 49 74 101 111 120 132 141-142 151 
217 225 238 271 317 404 446 469 503 
513-514 535 550 564 573 666-669 798 
898 910 927 976 1067 1083 1085 1178 
1254 


adult brain 


Clontech 


ABR001 


39 216 238 327 356 535 927 1056 1121 
1178-1180 1199 1251 


adult brain 


Clontech 


ABR006 


74 611 949 1034 1136 


adult brain 


Clontech 


ABR008 


14 32 41 61 81 86 89 120 132 138 145 
147 1 88 197 208 225 227-239 250 300- 
303 312 316 328-331 340 357-362 374 
380 384-391 408 414 446 448 464-467 
483 488 495-496 505 512 521 535 550 
$66 sii S8S 590 594 598 634 641 
ccq ££/c 68^ 725 742 764 767 786 801 
805 810 RT\ 826 829 83 1 836 841 887- 
923 927 934 943 950-95 1 963 976 995 
1000-1001 10061026 1034 1048 1057- 
1067 1086 1088 1090 1118 1120 1122- 
1128 1142 1162 1181-1192 1199 1204 
1218-1219 1225 1232 1253 1267 1271- 
3306 1342 1347 1349-1350 


adult brain 


Clontech 


ABR011 


49 238 1219 


adult brain 


BioChain 


ABR012 


74 238 


adult brain 


Invitrogen 


ABR013 


868 1268 


adult brain 


Invitrogen 


ABT004 


49 1 17 138 191 217 252 291 305 535 
566 596 663 670 746 798 816-819 876 
892 898 922 943 963 1034-1036 1 121 


cultured 
preadipocytes 


Stratagene 


ADP001 


41 74 101 138 21 1 238 304 537 582 
740 798 883 943 976 1067 


adrenal gland 


Clontech 


ADR002 


49 74 101 111 120 127 151215 238 
240-247 3 16 330 363-364 404 414 534- 
535 833 924-940 950 963 976 1001 
1003 1067-10701118 1156 1193-1200 
1325 


adult heart 


GIBCO 


AHR001 


38 49 71-72 74-77 79 92 99 101 1 1 1 
1 18 129 132 138 151 158-163 182 195- 
203 215 217 238 264 269 353 384 398 
408 434-439 446 504 512-513 519 537 
562-573 577 611-614 616-619 658 661 
671-672 722 734 757-773 815 828-835 
874 891 898 919 926-927 976 988 
1021 1037 1041 1062 1067 1071 1080 
1083 1093 1122 1131 1185 1201 1254 
1308 1331 1335 


adult kidney 


GIBCO 


AKD001 


41 49 51 71-74 78-85 94 100-101 103- 
107 111 119-120 138 151 157215217- 
21 8 238 250 264 294 304 384 404 440 
446 454 477 504-505 509 514 518-519 
535 537 564 574-583 620-627 639 653 
673-675 705 753 789 831 844 851 859 
877 909 918 927 956 963 976 1067 
1074 1083 1095 1178 1302 1331 1335 


adult kidney 


Invitrogen 


AKT002 


11-12 4149 111-112 215-217294 316 
446 487 564 575 844 868 910 927 976 
1116 


adult lung 


GIBCO 


ALG001 


8 101 111 151 187 402 446 490 514 
5 1 8 537 545 549 580 5oZ d^i jVH oj*+ 
640 651-652 676-678 725 851 873 918 
952 976 1042 1067 1076 1083 1152 


lymph node 


Clontech 


ALN001 


8 111 121 151 180-182 188 215 537 
545 549 651 o/y-ooz 107 bih-oiu 000 
873 927 952 976 1042 1059 1335 


young liver 


GIBCO 


ALV001 


8 64 79 111 186 215-216238 446 514 
519 537 564 653 683-684 698 753 798 
813 833 840 858 927 976 1038-1039 
1051 1085 1224 1245 1256 


adult liver 


Invitrogen 


ALV002 
ALV003 


40 71 292-293 305 384 468-469 496 
505 657 675 714 753 832 844 941-942 
976 1040 1076 1256 1293 
976 


adult liver 
adult ovary 


Clontech 
Invitrogen 


AOV001 


" 8 32 36 38 41 49 51 71 74 79-80 101 
104 111 120 122-125 138 140 143-149 
151 188-190 207-212 215-217 238 264 
3 1 6 384 409 440 445-446 496 504 5 12 
514 518-519 535 537 549-550 564 566 
571 580 582 600 618 638 657 667 681 
685-697 699 705 722 735-744 761 771 
815 833 842-865 868 875-876 918 9ZO- 
927 950 952 963 976 1023 1042 1048 
1051 1059 1072 1076 1083 1117 1120 
1124 1131 1144 1174 1224 1268 1331 


adult placenta 


Clontech 


APL001 


1335 

102 217 238 537 641 700 


placenta 


Invitrogen 
GIBCO 


APL002 
ASP001 


663 851 1048 

8 45 74 111 132 140 151 185 217 238 
294 414 446 477 504 5 14 534 545 549 
592 722 873 883 952 976 1041-1042 
1083 1093-1094 1152 1224 


testis 


GIBCO 


ATS001 


"72 107 111 113 126 140 151 183215 
238 446 497 537 642 701-706 811 877 
927 962 976 1083 1117 1131 


adult bladder 


Invitrogen 


BLD001 


'41 151 191 402-405 409 414 496 545 
592 607 706 873 952 1178 1329-1335 


bone marrow 


Clontech 


BMD001 


' 8 58-62 65-68 74 79 108 111 116 137 
147 151 164-174 213-215 238 305-307 
374 404 446 460 466 5 16 519 534 538- 
541 544-546 549-554 566 584 586 592 
596 607 610 628-629 643-645 652 707- 
708 774-789 844 866-871 873 919 927 
952 963 976 998 1034 1042 1064 1083 
1085 1120 1132 1152 1225 1229 1268 
1307 1310 


bone marrow 


Clontech 


BMD002 
BMD004 


" 6 8 37-38 52 74 77 105 111 1Z* 132 
210 317 510-511 545 549 581 598 628 
638 724 766 789 844 860 868 873 919 
927 952 963 968 976 104z ill J 1 1^1 
1160-1161 1229 1266 1346 

" 111 238 282 549 1083 


bone marrow 
adult colon 


Clontech 
Invitrogen 


CLN001 


52 260 264 299 494 536 545 564 592 
844 873 877 952 976 1042 1152 1268 
1336-1337 


adult cervix 
| diaphragm 


BioChain 
BioChain 


C V AUU1 

DIA002 


" 49 51 129 132 151 205 207 238 332- 
335 365-367 392-401 440 466 470-471 
518 537 597 629 832 877 927 976 1006 
1085 1 1 17 1 129-1 134 1 192 1202-1205 
1219 1309-1328 
74 976 1083 






WO 01/57188 



PCT/US01/03800 



endothelial cells 


Stratagene 


EDT001 


32 40-41 49 74 79 101 111 120 132 
138 151 204-206 215-217 238 269 3L6 
414 433 505 510 513 550 555 580 582 
596 675 722 745 798 814 836-841 851 
918 976 1041 1043 1073 1083 1131 
1331 


Genomic clones 
from the short arm 


Genomic DNA 
from Genetic 
Research 


EPM001 


525-532 927 


Genomic clones 
from the short arm 

of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM003 


47 525 


Genomic clones 
from the short arm 


Genomic DNA 
from Genetic 
Research 


EPM004 


525 927 


Genomic clones 
from the short arm 
of chromosome 8 


Genomic DNA 
from Genetic 
Research 


EPM005 


531 


esophagus 


BioChain 


ESO002 
FBR001 


74 138 238 
441-442 927 


fetal brain 
fetal brain 


Clontech 
Clontech 


FBR004 


215 893 927 1001 


fetal brain 


Clontech 


FBR006 
FBRs03 


48 61 101 120 132 138 140 147 208 
225 271 317 319 336 359 368 405-414 
519 550 571 594 686 715 722 764 824 
829 836 859 909 927 943 947 963 1057 
1067-1068 1104 1135-1140 1162 1206- 
1207 1235 1268 1288 1307-1308 1319 
1338-1350 
111 446 


fetal brain 
fetal brain 


Clontech 

Invitrogen 


FBT002 


41 51 120 151 192-194 264 504 512 
535 683 761 798 820-827 844 876 909 
963 976 1026 1048 1083 1 144 1302 


fetal heart 


Invitrogen 


FHR001 


446 566 761 


fetal kidney 
fetal kidney 


Clontech 
Clontech 


FKD001 
FKD002 


5174 111 127 140 151 184 294 537 
550 630-631 1319 
111 976 1083 


fetal kidney 
fetal lung 


Invitrogen 
Clontech 


FKD007 
FLG001 


238 974 

463 566 976 1074 1083 1093 


fetal lung 


Invitrogen 


FLG003 


41 238 330 407 415-416 537 573 844 
859 1048 1083 1116 1192 


fetal liver-spleen 


Columbia 
University 


FLS001 


8 14 34-35 37 41 43 49 51 54-56 63-64 
69-71 74 77 79 87-90 101 107 110-111 
114 120 128-131 138 140 147 150-155 
197 210 215 217 225 238 312 367 384 
414 440 446 460 468 483 496 504-507 
511-515 518-519 523 533-535 537 541 
544-545 547-550 555-560 564 566 571 
577 582 585-586 598 636 646-647 649 
652 664 698 709-710 714 722-723 731 
735-736 746-753 761 784 798 823 829 
832 844 851 858-859 868 873 876 898 
927 943 949 952 963 976 984 1002 
1021 1023 1040 1042 1044 1050 1083 
1093 1116 1120 1129 1131 1144 1174 
1217 1251 1254 1256 1302 1308 1311 
1319 


fetal liver-spleen 


Columbia 
University 


FLS002 


8 36-37 41-46 49 54 64 71 74 79 101 
111 120 129 147207 210215-216238 
250 330 353 359 366 383-384 414 478 
505 508-509 51 1 515-524 534-535 537 
544-545 564 566 571 577 591 598 638 
663 671 698 714 722 725 727 751 798 
851 859 873 876 909 927 949 952 983- 
984 1002 1023 1042-1044 1085 1095 
1131 1144 1178 1199 1233 1240-1270 
1331 1340 


fetal liver-spleen 


Columbia 
University 


FLS003 


64 535 976 1256 


fetal liver 


Invitrogen 


FLV001 


8 101 120 138 217 446 468 535 566 
580 722 730 749 844 918 943 976 1051 
1256 1331 


fetal liver 


Clontech 


FLV004 


537 926 1256 


fetal muscle 


Invitrogen 


FMS001 


51 111 264 312 369-370 404 417-421 
425 535 537 577 598 614 836 857 1 141 
1208 1268 


fetal muscle 


Invitrogen 


FMS002 


537 




Invitrogen 


FSK001 


13-26 32 41 51 89 107 111 147 151 
225 264 316 405 422-429 488-494 496 
519 534-535 537 566 675 732 859 876- 
877 898 947 949-950 963 976 1001 
1062 1076 1083 1117 1144 1165 1268 
1281 


fetal skin 


Invitrogen 


FSK002 


537 812 1 


fetal spleen 


BioChain 


FSP001 


87 549 


umbilical cord 


BioChain 


FUC001 


27-33 41 49 151 215 238 248-249 301 
3 1 6 446 495-503 5 1 9 52 1 534-535 537 
582 634 691 877 883 927 944-950 963 
976 1001 1075 1142-1143 1171 1218 
1243 1308 


fetal brain 


GIBCO 


HFB001 


41 49 57 79 87 103 111 120 132-135 
138 145 151 188 197 207 215 238 264 
271 294 316 367 414 440 446 466 504 
513-514 535 542-543 550 564 571 596 
635 648-654 675 711-715 722-723 798 
832 872 876 883 927 976 1095 1144 
1168 1171 1178 1211 1335 


macrophage 


Invitrogen 


HMP001 


238 


infant brain 


Columbia 
University 


IB2002 


49-50 77 81 89 105 111 136-138 140 
151 161 175-179 185 216-217 264 295 
299 308-310 371-373 462 476 504 51 1- 
513 533 537 564 566 571 655-657 662 
683 716-720 723 752 790-803 829 832 
858-859 876 898 909 949 976 1045- 
1047 1076-1087 1090 1093 1116 1122 
1144 1209-1213 1225 1233 1256 1319 
1341 


infant brain 


Columbia 
University 


IB2003 


41 50 77 104 132 215 238 508 512-513 
519 566 655 714 794 938 943 976 1067 
1092-1093 1233 


infant brain 


Columbia 
University 


IBM002 


311 472-473 753 1214 


infant brain 


Columbia 
University 


IBS001 


' 51 111 376 474 790 876 949 1144 1204 
1221 


lung, fibroblast 


Stratagene 


LFB001 


151 316462 514 534 582 675 939 1131 


lung tumor 


Invitrogen 


LGT002 


1-7 41 74 79 94 115 120 138-139 156 
215 217 269 280 296 337 374-375 384 

Af\A AA£ ASA Al e s^A9.C\ 40S S14 518-^19 

522 537 545 564 577 597 653 658 705 
721-724 754-756 779 859 868 872-874 
876-877 919 927 949 951-952 959 976 
1002 1042 1048-1053 1076 1083 1088- 
1089 1131 1144-1147 1216-1218 1229 
1293 1311 


lymphocytes 


ATCC 


LPC001 


41 74 111 132 151 253 316 446 55U 
634 844 927 976 1085 1268 


leukocyte 


GIBCO 


LUC001 


8 11 41 74 86 91-98 101 109 111 120 
147 151 212215218238252 288312- 
3 14. 3 1 6 338 359 408 427 443-447 505 
510 512 514 518 534 545 549-550 561 
564 566 571 577 580 582 587-609 615 
632-638 658-659 698 714 725-728 832 
836 841 859 866 873-874 882-883 918- 
919 927 943 952 963 976 1042 1076 
1083 1090 1148 1152 1168 1195 1219- 
1220 1224 


leukocyte 


Clontech 


LUC003 


74 100 215 232 238 339-341 446 545 
657 660 729 873 883 927 952 963 1008 
1042 1116 1120 1149-1150 1215 1222 


Melanoma from cell 
line ATCC #CRL 
1424 


Clontech 


MEL004 


210 215 238 342 534 545 592 722 873 
919 929 939 952 976 1071 1 1 18 1218 
1235 1245 


mammary gland 


Invitrogen 


MMG001 


8-10 40-41 49 73 80 1 14 13SM40 147 
217 250-256 264 297-299 305 377-378 
398 446 481-486 505 512 537 545 549 
571 592 725 730-733 816 829 836 844 
868 873 876-877 898 926 943 951-960 
963 976 995 1034 1042 1048 1054- 
1055 1076 1083 1091 1093 1116-1117 
1 124 1 152 1302 


induced neuron cells 


Stratagene 


NTD001 


39 101 111 138 238 361 1225 1251 
1319 


retinoic acid induced 
neuronal cells 

neuronal cells 


Stratagene 
Stratagene 


NTR001 
NTU001 


74 225 976 

129 225 238 304 313 361 657 976 
976 


pituitary gland 
prostate 


Clontech 
Clontech 
Clontech 


PIT004 

PLA003 

PRT001 


38 976 

1 1 1 188 238 257-258 564 724 961-966 
1067 1095 


rectum 


Invitrogen 


REC001 


238 430-431 841 859 868 963 1001 
1116 


salivary gland 


Clontech 


SAL001 


oKi Am *n?-4^ 446 496 868 952 
976 1083 1120 1151 1184 


small intestine 


Clontech 


SIN001 


8 101 147 215 259-266 446 462 505 
545 592 660 789 836 866 873 927 952 
963 967-978 1042 1120 1152 1223- 
1994 

238 302 927 943 992 1031 


skeletal muscle 

spinal cord 


Clontech 
Clontech 


SKM001 
SPC001 


74 111 132 151215-216 238 264 267- 
270 343-344 353 379 516 537 566 740 
828 927 976 979-994 1092 1153-1159 
1225 1250 
698 859 1042 


adult spleen 
stomach 


Clontech 
Clontech 


SPLc01 
STO001 


210 238 271-272 537 580 705 918 952 
995 1171 


thalamus 


Clontech 


THA002 


61 219-220 273-276 312 315 330 596 
963 996-1007 1059 1093 1160-1162 


thymus 


Clontech 


THM001 


8 120 151 208 221 316-317353 639 
7Sfi R67 874 878-881 927 963 1023 
1083 1094-1096 1124 


thymus 


Clontech 


THMc02 


8 61 114 129 132 210 225 231 3U6 
317-319 336 340 359 380 398 446 448- 
463 512 519 545 554 587 598 698 724- 
725 789 812 836 868 873 927 947 952 
976 1007 1042 1083 1085 1097-1116 
1122 1147 1177 1226-12291234 1311 
1313 


thyroid gland 


Clontech 


THR001 


"14 41 49 76 94 111 144 151 183 188 
21 0 21 7 222 253 264 271 277-286 
320-326 345-352 361 381-382 446 467 
483 5 14 534 549-550 564 578 602 649 
844 882-883 927 950 956 976 1008- . 
1028 1076 1083 1117-1120 1142 1163- 
1175 1230-1238 1308 


trachea 


Clontech 


TRC001 


223-225 238 287 353-354 514 
545 592 611 873 883-884 927 
QS7 1029-1031 1042 1151-1152 
1170 1176-1177 1239 


uterus 


Clontech 


UTR001 


T51 226 288-290 355 537 877 
885-886 976 1001 1032-1033 
1232 



TABLE 2 



SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


1 


B02829 


Homo sapiens 


Human G protein coupled receptor hRUP5 

protein SEQ ID NO: 10. 

Human secreted protein, SEQ ID NO: 7645. 


460 
111 


100 1 
51 


2 
3 


G03564 
R26173 


Homo sapiens 
Homo sapiens 


Part of Major Yo paraneoplastic antigen 
(CDR62) encoded by clone pY2. 
calcium channel L-type alpha 1 subunit 


293 
191 


76 
65 


4 

5 

6 


L29536 
Y94943 

M11507 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein clone ytl4_l protein 
sequence SEQ ID NO:92. 
transferrin receptor 


251 
120 


50 
95 


7 
8 

9 

10 


AF099100 
Y92338 

G01343 
AJ 133798 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


" WD-repeat protein 6 
Human cancer associated antigen precursor from 
clone NY-REN-45. 

Human secreted protein, SEQ ID NO: 5424. 
copine VII protein 


1941 
245 

226 
1127 


93 
82 

91 
68 


11 
12 
13 


G02449 
X98330 
AL024498 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6530. 
ryanodine receptor 2 

dJ417M14.2 (novel serine/threonine-protein 
kinase (ortholog of mouse and rat MAK (male 
germ cell-associated kinase)) 


584 
282 
293 


99 
78 
100 


14 


AF045577 


Pan 

troglodytes 


olfactory receptor OR93Ch 
"Human secreted protein, SEO ID NO: 7212. 


191 

93 


36 
39 


15 
16 


G03131 
U26595 


Homo sapiens 

Rattus 

norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


569 


89 


17 


B08918 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 28 SEQIDNO:75. 
Human secreted protein #75. 


99 
165 


44 

75 


18 
19 


Y36203 
U 15647 


Homo sapiens 
Mus 

musculus 


reverse transcriptase 

Human secreted protein, SEQ ID NO. 6782. 


106 
544 


40 | 
100 


20 
21 

22 


G02701 
Y35923 

G04030 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 172. 

Human secreted protein, SEQ ID NO. 8111. 


1691 
380 


100 
96 


23 
24 

25 
26 
27 
28 


G02455 
AF036329 

G04067 
S80119 
U83303 
G03267 


Homo sapiens 
Homo sapiens 

Homo sapiens 

Rattus sp. 
1 Homo sapiens 
| Homo sapiens 


" Human secreted protein, SEO ID NO: 6536. 

gonadotropin-releasing hormone precursor, 

second form 
" Human secreted protein, SEQ ID NO: 8148. 

reverse transcriptase homolog 

line- 1 reverse transcriptase 

" Human secreted protein, SbQ ID NO: 7348. 


123 
284 

96 
100 
101 
135 


50 
90 

32 
34 
35 
45 
30 


G04067 
G02872 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 
Human secreted protein, SEQ ID NO. 6953. 
Human secreted protein, SEQ ID NO: 7452. 


83 
116 
96 

42 

11 

67 


31 
32 
33 
34 


G03371 
G03224 
Y66688 
Y 87071 J 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7305. 
Membrane-bound protein PROH52. 
Human secreted protein sequence SEQ ID 
NO:110. 


58 

2457 
348 

182 


32 
98 
95 

48 


35 
36 


U15131 
Y73464 


Homo sapiens 
Homo sapiens 


pl26 

Human secreted protein clone yl4_l protein 
sequence SEQ ID NO: 150. 


982 


90 


37 


AL133215 


Homo sapiens 


bA108L7.6 (semaphorin 4G (sema domain, 
immunoglobulin domain (Ig), transmembrane 
domain (TM) and short cytoplasmic domain)) _ 


687 


99 

"66 


38 


AC067969 


amino acids 
3338-4088 


Homo sapiens ryanodine receptor 1 (skeletal) 


386 




39 

40 
41 


AL031588 

G03628 
AF 132969 


Homo sapiens 

Homo sapiens 
Homo sapiens 


dJ1163J1.1 (mostly supported by GENSCAN, 
FGENES and GENEWISE) 
Human secreted protein, SEQ ID NO: 7709. 
CGI-35 protein 


493 

110 

228 


76 

51 
68 


42 
43 
44 


Y36268 
X61048 
M76546 


Hydra sp. 

Helianthus 

annuus 


Human secreted protein encoded by gene 45. 
mini-collagen 

hydroxyproline-rich protein 


220 
105 
110 


88 
35 
31 


45 
46 


U82288 
G03477 


Caenorhabditi 
s elegans 
Homo sapiens 


Rac-like GTPase 

Human secreted protein, SEQ ID NO. 7558. 


139 

118 
113 


70 

58 
63 


47 
48 
49 


AF090942 

G03564 

AJ005560 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


PRO0657 

Human secreted protein, SEQ ID NO: 7645. 
SPR2B protein 

Human secreted protein, SEQ ID NO: 6531. 


90 
72 

385 


59 
56 

98 


50 
51 

52 


G02450 
Y91649 

U93563 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein sequence encoded by 
gene 60 SEQ ID NO:322. 
putative p150 


973 

105 
699 


94 

38 
85 


53 
54 
55 

56 


Y55927 
G02607 
AB008175 

M68941 


Homo sapiens 
Homo sapiens 
Mus 

musculus 

Homo sapiens 


Human STLK2 protein. 

Human secreted protein, SEQ ID NO. 6688. 

hepatic nuclear factor 1-beta short form 

protein-tyrosine phosphatase 


356 
165 


56 
74 

41 


57 
58 


AL031600 
AF011417 


Homo sapiens 
Mus 

musculus 


C390E6.1 (chloride channel 7) 
putative pheromone receptor 


338 
143 


76 
55 


59 


AF 167320 


Mus 

musculus 


zinc finger protein ZJP113 


558 
263 


68 
96 


60 
61 

62 


U/JUJO 

X07984 


Mus 

musculus 
Homo sapiens 


interferon regulatory factor / . 

protein-tyrosine kinase 

Human secreted protein clone cb98 4. 


297 
791 


69 
98 


63 
64 


U35376 
AF265555 


Homo sapiens 
Homo sapiens 


repressor transcriptional factor 
ubiquitin-conjugating BIR-domain enzyme 
APOLLON 

' Human secreted protein, SEQ ID NO: 7964. 


485 
785 

88 


65 
74 

95 


OJ 

66 


AF 177390 


Homo sapiens 

Manduca 

sexta 


antennal specific membrane protein AMP 


274 
614 


54 
100 


67 
68 

69 


AB040800 
AF030027 

G02965 


Homo sapiens 
Equine 
herpesvirus 4 
Homo sapiens 


24 

Human secreted protein, SEQ ID NO: 7046. 
Human oxidoreductase YTF03. 


213 

261 
1144 


" 26 

95 
98 


70 
71 
72 


W / J / / V 

AB011135 
AB014885 


Homo sapiens 
Homo sapiens 
Halocynthia 
roretzi 


KIAA0563 protein 
HrPOPK-1 


239 
813 


76 
78 


73 
74 


AF045454 
J02870 


Cavia 

porcellus 

Mus 


phospholipase B 

laminin receptor 


955 
308 


73 
61 

75 


Y0O826 


musculus 

Rattus 

norvegicus 


gp2l0 (AA 1-1886) 


413 


84 


76 


AF117754 


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP240 


351 
468 


54 

76 


77 
78 


Y38422 
Y14596 


Homo sapiens 
Homo sapiens 


Human secreted protein. 

Human T-type voltage-gated Ca channel alpha- 

I -I (hCavT3). 


1357 


99 


79 


Y14591 


Human 

nnnillomaviiu 

stype 68 


APM-1 protein 

dJ798A10.2 (KIAA0445 protein) 


767 
71 


100 
34 


80 
81 


AL137802 
AP000383 


Homo sapiens 

Arabidopsis 

thaliana 


protein arginine N-methyltransferase-like protein 


359 


65 


82 


L46815 


Mus 

musculus 


DNA binding protein Rc 


895 


75 
96 


83 
84 

85 


G01600 
Y53886 

AB029002 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein, SEQ ID NO: 5681. 
A suppressor of cytokine signalling protein 
designated HSCOP-6. 
KIAA1079 protein 


538 
134 


71 
42 


86 
87 


Y2oo/o 
Y99368 


Homo sapiens 

Homo sapiens 


Human cw272 7 secreted protein. 
Human PR01326 (UNQ686) amino acid 
sequence SEQ ID NO: 100. 


325 
156 


62 
48 


88 

89 
90 


AJ225124 

At 1 / / J.\Jj 
Y28280 


Mus 

musculus 

Homo sapiens 

Homo sapiens 


hyperpolarization-activated cation channel, 
HAC3 

cerebral cell adhesion molecule 

Human G-protein coupled receptor GR1R-2. 

polycystic kidney disease-associated protein 


487 

290 
326 
1751 


95 

56 
79 
95 


91 
92 
93 
94 


L39891 
AF064876 
AF 170723 
XI 3292 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Trypanosoma 
brucei 


ion channel BCNG-1 
protein kinase STK10 
GPI-phospholipase C (AA 1 - 358) 


953 
151 
661 


99 
37 
99 


95 
96 

97 


Y341z / 
X03638 

AF134213 


T-Inmn CPnlPfl^ 

JtTOino ^apic-no 

Rattus 
norvegicus 
Homo sapiens 


Human potassium channel K+Hnovl 1. 
sodium channel protein I (aa 1-2009) 

ubiquitin-specific protease 


1775 


92 
99 


98 
99 


G00838 
AF021935 


Homo sapiens 

Rattus 

norvegicus 


Human secreted protein, SEQ ID NO: 4919. 
myotonic dystrophy kinase-related Cdc42-binding 
kinase 

putative anion transporter 1 


213 

O / J 

867 


38 
48 

98 


100 
101 


AF279265 
AC007878 


Homo sapiens 
Homo sapiens 


match to nuclear protein, NP220; note: sequence 
difference at residue 58 


160 


60 


102 


U22829 


Mus 

musculus 


P2Y purinoceptor 


264 


42 


103 


Y45023 


Homo sapiens 


Human sensory transduction G-protein coupled 
receptor-B3. 


516 
787 


99 
98 


104

105 


Y94990

Y87342 

r\T 1 07J i c 


Homo sapiens 
Homo sapiens 

Homo sapiens


Human secreted protein vb21_1, SEQ ID NO:20.
Human signal peptide containing protein HSPP-119
SEQ ID NO:119.
hepatic angiopoietin-related protein


343 
212 


57 
67 


107 
108 


AF116657
AE000401


Homo sapiens 

Escherichia 

coli 


PRO1310 

sialic acid transporter 

Human secreted protein encoded by gene No. 10. 


74 
587 

693 


52 
96 

100 


109
110 


Y38395 
Y78801 


Homo sapiens 
Homo sapiens 


Hydrophobic domain containing protein clone 
HP00631 amino acid sequence, 
nuclear pore complex protein hnup153


182 
464 


94 
85 


111 
112 

113 


Z25535 
Y94939 

AF016365 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein clone ye90_1 protein
sequence SEQ ID NO:84.
hexokinase 1 isoform td 


274 

301 
520 


51 

71 
75 


114 

115 
116 

117 


AC007956 
M83738 
AL157952

W18084


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


unknown 

protein-tyrosine phosphatase 

dJ875K15.1.1 (ets homologous factor (ets- 
domain transcription factor ESE-3A, isoform 1)) 
Human Aurora-2. 


251 
484 

546 


92 
91 

87
SEQ

ID

NO:


Accession
No.


Species 


Description 


Smith- 
Waterman 
Score 
407 


% 

Identity 
62 


118 
119 


L41816 
AJ006710 


Homo sapiens 

Rattus 

norvegicus 


cam kinase 1

phosphatidylinositol 3-kinase


627 


93 


120 


AF026954 


Bos taurus 


pyruvate dehydrogenase phosphatase regulatory
subunit precursor, PDPr


1646 


94 


121 
122 


S39392 
U60805


Homo sapiens 
Homo sapiens 


protein tyrosine phosphatase, PTPase (EC 

3.1.3.48)

oncostatin-M specific receptor beta subunit


373 
262 


68 
88 


123 
124 


Y44403 
U88167 


Homo sapiens 
Caenorhabditis
elegans


Human truncated tankyrase-1. 

contains similarity to C2 domains 


111 
219 


35 
29 


125 


ArjUUMo 


Homo sapiens


guanine nucleotide binding protein beta subunit 
4 


693 


90 


126 


AB021861 


Mus 

musculus 


apoptosis signal-regulating kinase 2


153


65 


127 
128 


AF305210 
M90360 


Homo sapiens 
Homo sapiens 


concentrative Na+-nucleoside cotransporter

hCNT3 

protein kinase 

alpha 1C adrenergic receptor isoform_2 


807 

220 
574 


97 

73 
86 


129 
130 
131 

132 


D32202 

AF208043 

AF201734 

AF112886


Homo sapiens 
Homo sapiens 
Mus 

musculus 
Bos taurus 


IFI16b 

testis specific serine kinase-3 

differentiation enhancing factor 1


496 

ROD 

o\J\J 

159 


67 
87 

74 


133 
134 

135 


AJ278314 
W74802 

AB020335 


Homo sapiens

Homo sapiens 
Homo sapiens 


phospholipase C-beta-1b

Human secreted protein encoded by gene 73 

clone HSQEL25. 

pan'T««-speciFic gene 


554 
1157 

668 


85 
87 

96 


136 
137 


W80408 
AC002563 


Homo sapiens 
Homo sapiens 


A secreted protein encoded by clone dt674 I. 
putative RHO/RAC effector protein; 95% 
similarity to P49205 (PID:g1345860)
PRO3434 a novel secreted protein.


OOO 

5041 
891 


98
99

100 


138 
139 

140 


Y96736 
AB024034 

W97809 


Homo sapiens 
Arabidopsis 
thaliana 
Homo sapiens 


DNA-damage inducible protein DDI1-like
Human GTPase regulator GRAF. 


147 


55 
56 


141 
142 


Y51557 
AF090113 


Homo sapiens 

Rattus 

norvegicus 


Human PLA2 protein. 

AMPA receptor binding protein 


125 
623 

641 


46 

93 

82 


143 
144 


W26642 
U87306 


Homo sapiens 

Rattus 

norvegicus 


Human RECK cancer-inhibiting protein.
transmembrane receptor UNC5H2 


578 


84 


145 
146 


AF264014 
W63683 


Homo sapiens 
Homo sapiens 


scavenger receptor cysteine-rich type 1 protein 

M160 precursor

Human secreted protein 3. 


727 
140 


92 
40 


147 
148 


D64014 


Homo sapiens

Escherichia 
coli 


galactose-1-phosphate uridyl transferase
HrsA 


513 
818 


81 
90 


149 


M83316 


Escherichia 
coli 


pppGpp phosphohydrolase 


915 


95
99


150 


AL163279


Homo sapiens 


homolog to cAMP response element binding and 
beta transducin family proteins 
STE20-like kinase


1261 
940 


99 


151 
152 

153 
154 


AF179867
R95332 

AF151859 
X66957 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Tumor necrosis factor receptor 1 death domain 
ligand (clone 3TW). 
CGI-101 protein
hexokinase type 1 


392 

370 
489 


61 

92 
81 


155 
156 
157 


Y16355 
G00857 
AF159455 


Homo sapiens 
Homo sapiens 
Mus 

musculus


alternatively spliced form
Human secreted protein, SEQ ID NO: 4938.
zinc finger protein


432 
349 
352 

537 


92 
78 
74 

76 


158 
159 


L76191 
AP001743 


Homo sapiens 
Homo sapiens 


interleukin-1 receptor-associated kinase
putative gene, ankyrin like, possible dual
specificity Ser/Thr/Tyr kinase domain


670 


98 


160 
161 


AJ250425 
G02885 


Rattus 
norvegicus 
Homo sapiens 


Collybistin I 
Human secreted protein, SEQ ID NO: 6966.


556 

370 


74 
100
SEQ
ID
NO:
162


Accession
No.

Z22968


Species
Homo sapiens 


Description 
M130 antigen


Smith- 
Waterman 
Score 
610 
336 


% 

Identity 

100 

92 


163 
164 
165 


AF181121 
AF055636 
AF160798


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 


ATP-dependent Ca2+ pump PMR1
leucine-rich glioma-inactivated protein precursor 
calcium transporter CaT1


455 
700 


94 
96 


166 


Y76332 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 38. 

Human breast tumour-associated protein 68. 


327 
1 07? 


45 
99 


167 
168 


Y48607 
AB020741 


Homo sapiens 
Mus 

musculus


NDC-related kinase 


197 

596 


43 
44 


169 
170 


AF252293 
U59429


Homo sapiens 
Cricetinae 
gen. sp. 


PAR3 

diacylglycerol kinase eta 
phosphatidylserine-specific phospholipase A1


481 

386 


82 
42 


171 
172 


AF035268 
AF127085


Homo sapiens 
Mus 

musculus 


semaphorin cytoplasmic domain-associated
protein 3B 


507


82 


173 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No. 
123 

Human secreted protein, SEQ ID NO: 7060.


653 
538 


99 
97 


174 
175 


G02979 
U36488 


Homo sapiens 
Mus 

musculus 


embryonic stem cell phosphatase


168 


55 


176 


W95629 


Homo sapiens 


Homo sapiens secreted protein gene clone 
gml96 4. 

formiminotransferase cyclodeaminase form U


1022 
255 


100 

93 


177 

178 


AF289023 
X04936 


Homo sapiens 
Homo sapiens 


T-cell receptor alpha-chain (413 is 2nd base in
codon) 


710 


99 


179

180 
181 


AF127481

G00978 
Y66645 


Homo sapiens 

Homo sapiens 
Homo sapiens


non-ocogenic Rho GTPase-specific GTP
exchange factor 

Human secreted protein, SEQ ID NO: 5059.
Membrane-bound protein PRO1310.


175 

517 
671 
862 


80 

94 

96 
100 


182 
183 
184 
185 

186 


AF110640
AB020854
AF169691
AF126372

L20966 


Homo sapiens 
Bos taurus
Homo sapiens 
Homo sapiens 

Homo sapiens 


orphan seven-transmembrane receptor
orphan transporter short splicing variant 
cadherin-like protein VR8 
thyrotropin-releasing hormone degrading
ectoenzyme 
phosphodiesterase 


766 
375 
985 

541 


84 
38 
99 

76 


187 
188 

189 


G02920 
Y94918 

Y66713 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein, SEQ ID NO: 7001.
Human secreted protein clone dd504_18 protein
sequence SEQ ID NO:42.
Membrane-bound protein PRO1309.


254 
301 

694 


93 
98 

100 


190 
191 


G03244 
U36771 


Homo sapiens 

Rattus 

norvegicus 


Human secreted protein, SEQ ID NO: 7325. 
sn-glycerol 3-phosphate acyltransferase


331 
707 


73 


192 


R05935 


Homo sapiens 


«;#»/*«»ti»H HPTIh subunit of multiple subunit 
polypeptide (MSP)GPIIb-IIIa. 


157 


72 


193 


M92084 


Theileria 
parva 


casein kinase II alpha subunit
Membrane-bound protein PRO1310.


364 
448 


50 
90 


194 
195 


Y66645
W95631


Homo sapiens 
Homo sapiens 


Homo sapiens secreted protein gene clone 
hj968 2. 


382 


49 


196 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR


680 


99 


197 


AC021640 


Arabidopsis
thaliana 


putative phosphatidate phosphohydrolase


300 


41 


198 

199 
200 


AF073967 

W01730
AF117948


Mus 

musculus 
domesticus
Homo sapiens
Homo sapiens 


olfactory receptor 

Human G-protein receptor HPRAJ70. 
pancreas-enriched phospholipase C 


316 

617 
625 


43 

98 
89 


201 

202 
203 

204 

205 


AF128625
AF117946
Y53021 

AF227968 
S81752 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


CDC42-binding protein kinase beta 

Link guanine nucleotide exchange factor II 

Human secreted protein clone qc646_1 protein

sequence SEQ ID NO:48.

SH2-B beta signaling protein 

DPH2L=candidate tumor suppressor gene 


1303 
701 

182 
375 


94 

100 

99 

79 
100
SEQ
ID 

NO: 

206 


Accession 
No. 

U18315 


Species 
Sus scrofa 


Description 

(ovarian cancer critical region of deletion)
parathyroid receptor 

putative pheromone receptor V1RL1 long form 


Smith- 
Waterman 
Score 

122 
170 


% 

Identity 

60 
96 


207 
208 
209 
210 


AF255342 
S52051 
W63683 
D79992 


Homo sapiens 

Homo sapiens 
Homo sapiens 


neurotransmitter transporter 

Human secreted protein 3. 

similar to Drosophila photoreceptor cell-specific 

protein calphotin.


715 
840 
541 

1348 


94 
99 
82 

99 


211 
212 


AF117948
U81035 


Homo sapiens 

Rattus 

norvegicus 


pancreas-enriched phospholipase C 

ankyrin binding cell adhesion molecule 

neurofascin 

zinc finger protein


471 
798 


69 
56 


213 
214 


AF154846
AF102777


Homo sapiens 
Mus 

musculus 


FYVE finger-containing phosphoinositide kinase
putative gene containing transmembrane domain 


933 
523 


93 
89 


215 
216 

217 


AL163303 
U26595 

G04095 


Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 


prostaglandin F2a receptor regulatory protein
precursor 

Human secreted protein, SEQ ID NO: 8176. 


563 

644 
314 


78 
98 

a i 


218 
219 
220 


X75756 
Y66723 
D88577 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


protein kinase C mu 
Membrane-bound protein PRO1100.
Kupffer cell receptor


770 
567 

853 


98 
40 

100 


221 

222 


AF258465 
AF021935 


Homo sapiens 

Rattus 

norvegicus 


OTRPC4 

mytonic dystrophy kinase-related Cdc42-binding 
kinase 


636 


96 


223 


AL136527 


Homo sapiens 


uAoumi i /'A virincp fPR K A ^ anchor Drotein 
DAx1j15]j«1 (A Kinase ^rivjvn.; aii^nvii F 1UIV4 

i LL> 


693 
690 


100 
99 


224 
225 


AB032417 
AF030430 


Homo sapiens 
Mus 

musculus 


WNT receptor Frizzled-4 
semaphorin VIa


703 


68 


226 


AE000218 


Escherichia 
coli 


putative dihydroxy acetone kinase (EC 2.7.1.2) 
phosphoinositol 3-phosphate-binding protein-2 


297 
2080 


39 
100 


227 
228 


AF302150 
AB024573 


Homo sapiens 
Mus 

musculus 


GTP-binding like protein 2 


265 


88 


229 
230 


AF122924
G03205 


Xenopus 
laevis 

Homo sapiens 


Wnt inhibitory factor-1

Human secreted protein, SEQ ID NO: 7286. 


316 

229 
265 


40 

100 
92 


231 

232 
233 

234 


X98260 
R92754 
R75111 

W69431 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


M-phase phosphoprotein 11

Human growth differentiation factor-12.

Glycosyl-phosphatidylinositol-specific

phospholipase-D.

Human secreted protein cwl233_3. 
serine palmitoyltransferase, subunit II 


682 
290 

235 
859 


95 
97 


235 
236 
237 


Y08686 
AF118275
X81466 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


atrophin-related protein ARP 
Embryo Brain Kinase 


117 
460 


37 

OZ 
11 


238 


U64857 


Caenorhabditis
elegans


similar to the BPTI/Kunitz family of inhibitors;
most similar to tissue factor pathway inhibitor

precursor (TFPI)


284 


55 


239 


AJ250840 


Mus 

musculus 


serine/threonine protein kinase 


739 


63 


240 


AJ223472 


Mus 

musculus 


transcription elongation factor TFIIS.h


222 


JO 


241 


Y94906 


Homo sapiens 


Human secreted protein clone rb649_3 protein

sequence SEQ ID NO: 18. 

Na+/sulfate cotransporter SUT-1


353 
591 


52
99


242 
243 


AF169301
L22022 


Homo sapiens 

Rattus 

norvegicus 


orphan transporter v7-3 


667 


93 


244 

245 
246 
247 
248 
249 


AF016191 

AF097366 
Y29868 
AF180475
Y17227 
AF250910 


Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Manduca 


potassium channel 

cone sodium-calcium potassium exchanger 
Human secreted protein clone pp325 9. 
Not4-Np 

Human secreted protein (clone yal-1). 
death-associated small cytoplasmic leucine-rich


1043 

645 
497 
188 
690 
182 


98 
98 

98

83 

99 

31
SEQ
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 






sexta 


protein SCLP






250 


AF192756


Kaposi's
sarcoma-associated
herpesvirus


Orf73 


134 


34 


251 


a TiDllAOA 




MOK protein kinase


209 


83 


252


W j j LH j 


Homo sapiens


Neural adhesion molecule (ethbO018f2 product). 


469 


100 


253 


L46815 


Mus 

musculus 


DNA binding protein Rc 


251 


67 


254 


Woo jUj 


Homo sapiens


Human acid sensing ionic channel.


173 


82 


255 


AF070066 


Mus 

musculus 


Citron-K kinase 


1201 


98 


256 


G02491 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6572.


460 


100 


257 


Z12841 


Oryctolagus 
cuniculus 


phospholipase


368 


80 


258 


Y95436 


Homo sapiens 


Human calcium channel SOC-3/CRAC-2 


1857 


99 


259 


AJ222968 


Mus 

musculus 


L-periaxin


430 


72 


260 


AJ250839 


Homo sapiens 


serine/threonine protein kinase 


861 


100 


261 


AJ249977 


Homo sapiens 


AMP-activated protein kinase gamma 3 subunit


758 


98 


262 


AF141386 


Rattus 
norvegicus 


SLIT-2 



198 


40 


263 


AF022859 


Homo sapiens 


neuropilin-2(a0)


335 


62 


264 


AF160477


Homo sapiens 


Ig superfamily receptor LN1R precursor 


387 


91 


265 


Y44662 


Homo sapiens 


Human 1427.3 G-protein coupled receptor
(GPCR).


636 


99 


266 


U27269 


Mus 

musculus 


sodium glucose cotransporter 


204 


56 


267 


AF124491 


Homo sapiens 


ARF GTPase-activating protein GIT2 


159 


75 


268 


AF127389


Rattus 
norvegicus 


putative taste receptor TR1 


209 


39 


269 


X98296 


Homo sapiens 


ubiquitin hydrolase 


215 


95 


270 


X78482 


Streptococcus 
pyogenes 


Fc-gamma receptor


129 


26 


271 


AB009883 


Nicotiana 
tabacum 


KED 


109 


26 


272 


AF137367


Mus 

musculus 


VPS 10 domain receptor protem oUrvv-.o 


899 


97 


273 


L34938 


Rattus 
norvegicus 


ionotropic glutamate receptor 


460 


86 


274 


AL022724 


Homo sapiens 


dJ41jrlo.i.l (namsier /\nurogen-ucpeiiucni 
Pvnr^«t«^H Pmtrin T TXK PUTATIVE Drotcin^ 
(isoform 1)


188 


74 


275 


AF265555 


Homo sapiens 


APOLLON


173


94


276 


LjUzo 1 1 


Homo sapiens


Human secreted protein, SEQ ID NO: 6953.


148 


56 


277 


L4uJoU 


Homo sapiens


thyroid receptor interactor


430 


61 


278 


AB046851


Homo sapiens 


KIAA1631 protein 


283 


96 


279 


AL0U8O75 


Arabidopsis 
thaliana


Contains PF|00069 Eukaryotic protein kinase 
domain. 


157 


43 


280 


M83738


Homo sapiens 


protein-tyrosine phosphatase


181 


73 


281 


AK024397 


Homo sapiens 


unnamed protein product 


439 


91 


282 


AF141326


Homo sapiens 


RNA helicase HDB/DICE1


497 


84 


283 


AF156530 


Mus 

musculus 


ETS-domain transcriptional repressor PE1


605 


76 


284 


Vim "3 £. 


Homo sapiens


Human secreted protein clone cs756 2 alternate 
reading frame protein. 


647 


100 




Y73402 


Homo sapiens 


Human secreted protein clone yc25_1 protein
sequence SEQ ID NO:26.


300 


90 


286 


AF016411 


Homo sapiens 


KCNA3.1B 


137 


100 


287 


W89253 


Homo sapiens 


Human ALP. 


688 


97 


288 


AF112886


Bos taurus 


differentiation enhancing factor 1 


750 


96 


289 


AF113131


Homo sapiens


host cell factor homolog LCP 


367 


44 


290 


U52111


Homo sapiens


plexin-related protein


698 


100 


291 


AF026504


Rattus


SPA-1 like protein p1294


603 


89
SEQ

ID

NO:


Accession
No.


Species



Description 


Smith- 
Waterman 
Score 


% 

Identity 




AF102854


norvegicus 

Rattus 

norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


124 


53 


293 


X99211 


Drosophila 
melanogaster 


ubiquitin-specific protease 


143 


38 
94 


294 
295 


Y94943 
Y94890 


Homo sapiens 
Homo sapiens 


Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92.
Human protein clone FEP02798. 


185 

108 
154 


59 
96 


296 
297 
298 


AF019767

Y28568 

Y94943 


Homo sapiens 
Homo sapiens 
Homo sapiens 


zinc finger protein 

Secreted peptide clone bd577 1. 

Human secreted protein clone yt14_1 protein

sequence SEQ ID NO:92. 


568 


84 

y f 


299 
300 

301


B08906 

R58890 
AF022859 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by 
gene 16 SEQ ID NO:63.

Human-32 cadherin-related molecule. 

neuropilin-2(a0)

Human mitogenic regulator duox2. 


605 

212 
277 
716 


69 

97 
100 

on 
y / 


302 
303 
304 
305 


Y71124
Y44297 
D32050 
U43586 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human receptor tyrosine kinase. 

alanyl-tRNA synthetase 

protein kinase related to Raf protein kinases; 
Method: conceptual translation supplied by 
author 

Human H13 viral receptor mutant 4. 


228 
192 
428 

280 


97 
80 
72 

95


306
307 


R54872 
D78572 


Homo sapiens 
Mus 

musculus 


membrane glycoprotein 


199 


41 

88 


308 
309 


AF255614 
S79463 


Rattus 
norvegicus 
Mus sp. 


scaffolding protein SLIPR 
semaphorin homolog=M-Sema F


639 
162 


89 


310 
311 


AF178941 
U03413 


Homo sapiens 
Dictyostelium 
discoideum


ATP-binding cassette sub-family A member 2 
calcium binding protein 


736 
151 


100 
36 

100 


312 


Y87347 


Homo sapiens 


Human signal peptide containing protein HSPP-124
SEQ ID NO:124.


744 
789 


99 


313 
314 


Z97055 
AC004010 


Homo sapiens 
Homo sapiens 


dJ388M5.4 (putative GS2 like protein)

similar to Leucine-rich transmembrane proteins, 

44% similarity to U42767 (PID:g1736918)


197 


38 


315 


AL021392 


Homo sapiens 


dJ439F8.2 (supported by GENSCAN and 
GENEWISE)


278 


38 


316 


U70209


Mus 

musculus 


polycystic kidney disease 1 protein 


165 


38 
38 


317 


AF109643


Rattus 
norvegicus 


coxsackie-adenovirus-receptor homolog


223 
138 


84 


318 
319 

320 


AF104923 
AF100287

G00588


Homo sapiens 
Trypanosoma 
vivax 

Homo sapiens 


putative transcription factor

activated protein kinase C receptor homolog
Human secreted protein, SEQ ID NO: 4669.


141 
125 


38 

C 1 

J 1 


321 
322 


Y21591 
D26070


Homo sapiens 
Homo sapiens 


Human secreted protein (clone CC332-33). 
human type 1 inositol 1,4,5-trisphosphate 
receptor 


459 
232 


97 
97 

88 


323 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No.
123.

neuronal thread protein AD7c-NTP


306 
209 


70 


324
325 

326 
327 


AF010144 
M19650 

W80396 
X75756 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC
3.1.4.37)

A secreted protein encoded by clone opo^o iu. 
protein kinase C mu 


214 

540 


97 

70 
78 


328 
329 
330 


G02292 
AF168990
S67984 


Homo sapiens 
Homo sapiens 
Homo sapiens


Human secreted protein, SEQ ID NO: 6373.
putative GTP-binding protein
anti-HIV gp120 antibody heavy chain variable
region 

LDL-receptor related precursor (AA -19 to 4525)


721 
877 
581 

2823 


99 
99 
80 

98 


331 
332 

333 
334 


X13916 
Y87330 

Y28503 
AC002563 


Homo sapiens 
Homo sapiens 

Homo sapiens
Homo sapiens 


Human signal peptide containing protein HSPP- 
107 SEQ ID NO: 107. 

HGFH3 Human Growth Factor Homologue 3.
putative RHO/RAC effector protein; 95%


1127 

320 
327 


100 

98 
93 



111 



PCT/US01/03800 

WO 01/57188 



SEQ

ID

NO:


Accession 
No. 


Species 


Description 

similarity to P49205 (PID:g1345860)


Smith- 
Waterman 

Score 


% 

Identity 


335


Y87347


Homo sapiens 


Human signal peptide containing protein HSPP-124
SEQ ID NO: 124.


1111 


67 


336 


AF006466 


Mus 

musculus 


lymphocyte specific formin related protein 


193 


75 
97 


337 
338 


AF265555 
Y13443


Homo sapiens 
Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme
APOLLON 

Amino acid sequence of hblo3-2.
putative GABA-gated chloride channel 


632 

516 
189 


100 
100 


339 
340
341 


Y07637 
Y05734 
AJB 000497 


Homo sapiens 
Homo sapiens 
Escherichia 
coli 


Human Grb7 effector protein.
L-idonate transcriptional regulator 


2156 
928 


99 
98 


342 


D90855 


Escherichia 
coli


glycerol-3-phosphate dehydrogenase (EC
1.1.99.5) chain A, anaerobic 


769 


99 


343 


D85613 


Escherichia 
coli 


membrane component 


399 


100 


344 


M93239 


Escherichia 
coli 


transmembrane protein 


232 


100 


345 


M60177 


Escherichia 
coli 


enterobactin 


759 


99 


346 


D90699 


Escherichia 
coli


Sensor protein copS (EC 2.7.3.-).


638 


97 
100 


347 


D90843 


Escherichia 
coli 


CapB protein. 


552


96


348 


M13422 


Escherichia 
coli 


49 kd protein 


1193 




349 


L10328


Escherichia 
coli 


similar to drug resistance translocases 


340 


90 
82


350 


X69942 


Mus 

musculus 


enhancer-trap-locus-1


JOKJ 




351 


AF239613 


Homo sapiens 


apamin-sensitive small-conductance Ca2+-
activated potassium channel


463 


80 


352


D90777


Escherichia 
coli 


3-hydroxybutyryl-CoA dehydrogenase (EC 
1.1.1.157) (b-hydroxybutyryl-CoA
dehydrogenase) (BhbD). 


577 


100 


353 


D90863 


Escherichia 
coli 


similar to 

Human transmembrane protein HP02000. 


311 
133 


98 
58 


354 
355 

356 


Y52386 
Y58637 


Homo sapiens 
Homo sapiens

Homo sapiens 


Human transport-associated protein-7 (TRANP-7).

Protein regulating gene expression PKCifc-3U. 


482 
119 


55 
51 


357 
358 

359 


AF119226
Y87219 

J00132 


Homo sapiens 
Homo sapiens 

Homo sapiens 


dual-specificity tyrosine phosphatase YVH1 
Human secreted protein sequence SEQ ID 
NO:258. 
beta-fibrinogen 
■ . ■ . i ceo rn "WO - 7870 


1788 
165 

233 
128 


100 
100 

93 
70 


360 
361 
362 

363 
364 
365 


G03789 
R28916 
U16655 

G03119 
U47276 
G03789 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 
Gallus gallus 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.
Type III procollagen (prior art),
phospholipase C delta-4

Human secreted protein, SEQ ID NO: 7200.
chicken brain factor-2

Human secreted protein, SEQ ID NO: 7870.


108 
649 

95 

104 

183 


40 
65 

42 
34 
65 


366 

367

368 
369 


G04091 
X98258 
AL021366 
U70932 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Peromyscus
leucopus


Human secreted protein, SEQ ID NO: 8172.
M-phase phosphoprotein 9
cICK0721Q.3 (Kinesin related protein)
reverse transcriptase 


118 
564 

3387 
92 


46 

75 
99 
59 


370 
371 


X86400 
G03172 


Homo sapiens 
Homo sapiens 


gamma subunit of sodium potassium ATPase 
like 

Human secreted protein, SEQ ID NO. 


242 

165 
257 


73 

56 
55 


372 
373 
374 

375 


U49974 
X13916 
AF234765 

U49974 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 


mariner transposase 

LDL-receptor related precursor (AA -19 to 4525)
serine-arginine-rich splicing regulatory protein 

mariner transposase 


21193 
1182 

172 


99 
78 

1 55 






SEQ 

ID 

NO: 
376 


Accession 

No.

G01984


Species 
Homo sapiens 


Description 

Human secreted protein, SEQ ID NO: 6065. 


Smith- 
Waterman 
Score 
221 


% 

Identity 
67 


377 
378 

379 


G00669 
X52574 

R69095 


Homo sapiens 
Mus 

musculus 
Homo sapiens 


Human secreted protein, SEQ ID NO: 4750. 
GTP binding protein 

Anti-HIV Fabtat31 light chain. 


600 
1456 

68 
125 


100 
91 

37 
37 


380 
381 
382 

383 
384 
385 
386 
387 
388 


J04974 

AB002405 

U64830 

G02916 

G01194 

AJ245822 

D86974 

G03203 

G04072 


Homo sapiens 
Homo sapiens 
Dictyostelium 
discoideum
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


alpha-2 type XI collagen 
LAK-4p 

protein tyrosine kinase 

Human secreted protein, SEQ ID NO: 6997. 
Human secreted protein, SEQ ID NO: 5275. 
type I transmembrane receptor
KIAA0220

Human secreted protein, SEQ ID NO: 7284. 
Human secreted protein, SEQ ID NO: 8153. 


530 
115 

618 

617 

4560 

2148 

142 

99 


43 
44 

98 

93 

100 

98 

50 

59 


389 
390 
391 
392 
393 
394 
395 


M12140 

AJ293309 

Y42751 

W48351 

Y14442

W85607 

Y76332 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


envelope protein 
NHP2 protein 

Human calcium binding protein 2 (CaBP-2).

Human breast cancer related protein BCRB2. 

olfactory receptor protein 

Secreted protein clone da228 6. 

Fragment of human secreted protein encoded by 

Human secreted protein, SEQ ID NO: 8011. 


197 
461 
181 
241 
339 
957 
171 

250 


51 

77 

94 

66 

54 

100 

34 

100 


396 
397 


G03930 
AB032904 


Homo sapiens 

Hylobates 

syndactylus 


dopamine receptor D4 
stromal antigen 3, (STAG3) 


105 
Sol 


35 

RS 


398 
399 


AJ007798 
Y91405 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by 

gene 2 SEQ ID NO: 126. 

Human secreted protein clone cb98 4. 


1047 

lb J. 


92 

J / 


400 
401 

402 


Y29861 
D87002 

AF100754


Homo sapiens 
Homo sapiens 

Homo sapiens 


similar to rat integral membrane glycoprotein, 

accession number Z21513. 

ancient ubiquitous protein AUP1 isoform 


527 
853 


78 
95 


403 
404 


X74904 
AF075462 


Gallus gallus
Mus 

musculus 


alpha-2-macroglobulin receptor
ADP-ribosylation factor-directed GTPase 
activating protein isoform b 


258 
545 


60 
89 


405 
406 


X92887 
Y30162 


Human 
endogenous 
retrovirus K 
Homo sapiens 


pol/env 

Human dorsal root receptor 4 hDKK4. 


162 
2833 


30 

72 
99 


407 
408 
409 


AK022626 

L13802 

Y91600 


Homo sapiens 
Homo sapiens 
Homo sapiens 


unnamed protein product 

ribosomal protein small subunit

Human secreted protein sequence encoded by 

gene 9 SEQ ID NO:273. 


Zo4 
1788 


89 


410 


W88745 


Homo sapiens 


Secreted protein encoded by gene 30 clone 
HTSEV09. 


2004 


99 


411 


AB043953 


Mus 


Chat-H 


2628 


82 


412 


Y86233 


Homo sapiens


Human secreted protein HNTMX29, SEQ ID
NO: 148. 


1014 


92 


413 

414

415 


U10542
G03203 


Pan 

troglodytes 
Homo sapiens 
Homo sapiens 


MHC class I A 
NY-REN-7 antigen 

Human secreted protein, SEQ ID NO: 7284. 
Human transmembrane protein HTMPN-35. 


265 

850 

88 

266 


71 

95 
48 
89 


416 
417 
418 
419 


Y57911 
W27651 
Y76884 
AF255559 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Notothenia
coriiceps 


Secreted protein AT205. 

Retinoblastoma binding protein-7 sequence.

alpha tubulin 

Human secreted protein, SEQ ID NO: 6065. 


481 

3077 

289 

209 


60 
87 
68 

J 74 


420 
421 


G01984
AL 3 09827 


Homo sapiens 
Homo sapiens 


dJ309K20.2 (acrosomal protein ACR-55 (similar 
to rat sperm antigen 4 (SPAG4))) 


1446 


96

■ru 


422 


AC008075


Arabidopsis 
thaliana 


F24J5.4 


112 


1 






SEQ
ID

NO:


Accession
No.


Species 


Description


Smith- 
Waterman 

Score 


% 

Identity 


423


AF711705 


Homo sapiens 


Alu co-repressor 1


1090 


100 


424


AF714887 


Homo sapiens 


FLAMINGO 1 


6268 


97 


425 


Y35942 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 191. 


1961 


99 


426




Homo sapiens


N-copine 


635 


98 


427


L12392 


Homo sapiens 


Huntington's Disease protein 


16080 


99 


428 


Y94990 


Homo sapiens 


Human secreted protein vb21_ 1, SEQ ID NO:20. 


768 


98 


429




Homo sapiens


zinc finger protein Cezanne


542 


87 


430


I 5444 1 


Homo sapiens


Amino acid sequence of a human RNA- 
associated protein. 


2074 


100 


431




Homo sapiens


Human secreted protein, SEQ ID NO: 6931. 


723 


95 




G04067


Homo sapiens


Human secreted protein, SEQ ID NO: 8148. 


73 


42 


433 


Ax 1 jy/Vo 


Lycopersicon


extensin-like protein


613 


48 


434 


W48351


Homo sapiens


Human breast cancer related protein BCRB2. 


135 


44 


435


A /30 /4 


Homo sapiens


phosphorylase kinase 


3442 


97 


436


A P 1 A 1 Alf\ 


Homo sapiens


HSPC308 


268 


74 


437 


vino 1 0 


Homo sapiens


Human secreted protein encoded from gene 2. 


1055 


52 


438 


G03798


Homo sapiens


Human secreted protein, SEQ ID NO: 7879. 


168 


56 


439 


X14700 


Homo sapiens 


GABA-A receptor alpha 1 subunit 


2294 


96 


440 


\TtT2A A 


Homo sapiens


beta-tubulin


311 


95 


441 


AF168418


Homo sapiens


activating signal cointegrator 1


1882 


100 


442 


L11672


Homo sapiens 




795 


54 


443 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


93 


26 


444 


A52140 


unidentified 


tti TKX A XI "KTHR 
riUIvlAIN JNLylN. 


2451 


100 


445 


X98330 


Homo sapiens 


ryanodine receptor 2.


9356 


99 


446 


AF116712


Homo sapiens 




227 


49 


447 


AF245447 


Homo sapiens 


sphingosine kinase type 2 isoform


576 


99 


448 


AF133086


Homo sapiens 


membrane-type serine protease 1


2630 


94 


449 


U87305 


Rattus 
norvegicus 




817 


93 


450 


AF081249 


Homo sapiens 


JAW1-related protein MRVI1A long isoform


4568 


99 


451 


AC005498 


Homo sapiens 


R31665 1 


316 


62 


452 


M60235 


Homo sapiens 




464 


73 


453 


AB036706 


Homo sapiens 


intelectin


730 


88 


454 


G00918 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4999. 


263 


81 


455 


Y22634 


Homo sapiens 


Muman cyioKinc lnuuoiuic icguiawvi/ piuvwn* « 


192 


67 


456 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 62. 


106 


40 


457 


N91325 


Homo sapiens 


DNA encoding human growth hormone receptor. 


3282 


% 


458 


M19155 


Plasmodium 
falciparum 


S-antigen precursor


110 


36 


459 


Y13377


Homo sapiens 


Amino acid sequence of protein PRO257


509 


98 


460 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 

clone HTDAD22.


149 


43 


461 


Y14482



Homo sapiens 


Fragment of human secreted protein encoded by
gene 17. 


184 


54 


462 


Y53005 


Homo sapiens 


Human secreted protein clone pm749 8 protein 
sequence SEQ ID NO: 16.


135 


47 


463 


X84960 


Triticum
aestivum


low molecular weight glutenin 


109 


33 


464


W 1 O0 1 0 

w j yy 1 7 




Human Ksr-1 (kinase suppressor of Ras). 


1781 


85 




A-T Joy /04 


Mus

musculus 


alpha/beta hydrolase- 1 


502 


59 


466


uy3 joy 




p40 


101 


30 


467 


Y41 jZo 


Homo sapiens


Fragment of human secreted protein encoded by 
gene 77. 


1172 


99 


468 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


149 


52 


469 


AJ000008 


Homo sapiens 


PI3-kinase


5832 


97 


470 


X70922 


Mus 

musculus 


neurotoxin homologue 


118 


47 


471 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


198 


75 


472 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by
gene 62.


72


57



WO 01/57188

PCT/US01/03800


SEQ
ID
NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%
Identity


473 


G02313 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6394. 


328 




474 


Y07007 


Homo sapiens 


Breast cancer associated antigen precursor 
sequence. 


1013 


97 


475 


W93254 


Homo sapiens 


Human ESRP1 protein. 


943 


80 


476 


W48351 


Homo sapiens 


Human breast cancer related protein BCRB2. 


236 


65 


477 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


202 


60 


478 


G01870 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5951. 


267 


100 


479 


AF102777 


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase


3427 


92 


480 


G03052 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7133. 


123 


53 


481 


W87701 


Homo sapiens 


A human membrane fusion protein designated 
SYTAX1. 


221 


77 


482 


G03119 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7200. 


131 


39 


483 


AF210651


Homo sapiens 


NAG18 


124 


59 


484 


AF010144 


Homo sapiens 


neuronal thread protein AD7c-NTP 


343 


50 


485 


G00637 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4718. 


129 


70 


486 


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein 
3 


149 


73 


487 


Y76167 


Homo sapiens 


Human secreted protein encoded by gene 44. 


627 


100 


488 


AJ275213 


Homo sapiens 


stabilin-1 


1244 


91 


489 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


313 


65 


490 


L12392


Homo sapiens 


Huntington's Disease protein 


16081 


100 


491 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


197 


66 


492 


J03799 


Homo sapiens 


laminin-binding protein 


228 


70 


493 


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


128 


41 


494 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


197 


67 


495 


AC005175 


Homo sapiens 


R31449 3 


889 


94 


496 


G03786 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867. 


229 


61 


497 


AB030237 


Canis 
familiaris 


D4 dopamine receptor 


90 


48 


498 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


228 


65 


499 


U70935 


Peromyscus 
maniculatus 


reverse transcriptase 


213 


52 


500 


U48508 


Homo sapiens 


skeletal muscle ryanodine receptor 


26406 


99 


501 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


105 


58 


502 


AF119851 


Homo sapiens 


PRO1722


156 


62 


503 


AF113685


Homo sapiens 


PRO0974 


116 


50 


504 


U79458 


Homo sapiens 


WW domain binding protein-2 


322 


59 


505 


W29651 


Homo sapiens 


Human secreted protein CD124_3. 


608 


55 


506 


W85459 


Homo sapiens 


Secreted protein encoded by clone dhl 135_9. 


986 


70 


507 


Y86265 


Homo sapiens 


Human secreted protein HUSXE77, SEQ ID 
NO:180. 


115 


33 


508 


AL160175 


Homo sapiens 


bA243J16.3 (similar to MYLK (myosin, light 
polypeptide kinase)) 


184 


92 


509 


U43360 


Peromyscus 
maniculatus 


reverse transcriptase 


97 


62 


510 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


117 


63 


511 


W79092 


Homo sapiens 


Human secreted protein dn740_3. 


1058 


300 


512 


AF010144 


Homo sapiens 


neuronal thread protein AD7c-NTP 


205 


64 


513 


AJ133439 


Homo sapiens 


GRIP1 protein 


2151 


100 


514 


AE003456 


Drosophila 
melanogaster 


CG6393 gene product 


259 


42


515 


Z17206 


Xenopus 
laevis 


p46XlEg22 


128 


40 


516 


AF104413


Homo sapiens


large tumor suppressor 1 


1766 


94 


517 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


92 


40 


518 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


519 


S80864 


Homo sapiens 


cytochrome c-like polypeptide 


318 


50 


520 


X92485 


Plasmodium 
vivax 


pva1


170 


61 





521 


G03790 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7871.


1 SQ 


so 


522 


AF121857 


Homo sapiens 


sorting nexin 7 


Z jy 


/in 


523 


G02654 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6735.


oZ 


37 


524 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


T<i 

1j5 


73 


525 


AF119851


Homo sapiens 


PRO1722


loz 


S7 


526 


Y27761 


Homo sapiens 


Human secreted protein encoded by gene No. 47. 


1 ^ A 


S7 


527 


G02707 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6788.


/U 


A* 


528 


U47924 


Homo sapiens 


C8 


1112


B£ 
oO 


529 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


84


A* 


530 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


111


oU 


531 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


92 


CO 


532 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348.


75 


oo 
Zy 


533 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


182 


A Q 

4o 


534 


AF068286 


Homo sapiens 


HDCMD38P 


861 


100 


535 


U07707 


Homo sapiens 


epidermal growth factor receptor substrate 


228 


60 


536 


G01955


Homo sapiens 


Human secreted protein, SEQ ID NO: 6036. 


484 


75 


537 


AF219232


Gallus gallus 


qin-induced kinase 


206 


53 


538 


AF135022


Homo sapiens 


mediator 


128 


100


539 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348.


141 


59 


540 


AF016430 


Caenorhabditis
elegans


contains similarity to a BR-CV1 i K domain 


853 


39 


541 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN; 45% 
similarity to P22059 (PID:g129308)


408


66


542 


M29487 


Homo sapiens 


integrin alpha subunit precursor 


517 


81 


543 


AF102530 


Mus 

musculus 


olfactory receptor F3 


327 


73 


544 


Y73431 


Homo sapiens 


Human secreted protein clone yM86_l protein 
sequence SEQ ID NO:84. 


386 


100 


545 


AE004833 


Pseudomonas 
aeruginosa 


probable TonB-dependent receptor 


279 


42 


546 


G03793 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7874. 


264 


53 


547 


Y69192 


Homo sapiens 


A human monocyte-macrophage apolipoprotein
B receptor protein.


1772 


67 


548 


Y91493 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 43 SEQ ID NO: 166. 


176 


100 


549 


G01571 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5652. 


777 


99 


550 


AF044588 


Homo sapiens 


protein regulating cytokinesis 1; PRC1 


1953 


88 


551 


Y29332 


Homo sapiens 


Human secreted protein clone pe584_2 protein 
sequence. 


1224 


94 


552 


X98330 


Homo sapiens 


ryanodine receptor 2 


24621 


99 


553 


Y42782 


Homo sapiens 


Human UC Band #331 protein. 


684 


95 


554 


AB025258 


Mus 

musculus 


granuphilin-a 


501 


41 


555 


AJ010346


Homo sapiens 


RING-H2 


1468 


100 


556 


W92388 


Homo sapiens 


Human TR-interacting protein S239a. 


538 


92 


557 


AF119851 


Homo sapiens 


PRO1722


175 


59 


558 


AF117756


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP150 


183 


32 


559 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


319 


68 


560 


D86214 


Mus 

musculus 


Ca2+ dependent activator protein for secretion 


1010 


93 


561 


AF187325 


Canis 
familiaris 


melanoma antigen 


287 


55 


562 


AJ001981


Homo sapiens 


OXA1L 


2512 


99


563 


Z17238 


Rattus 
norvegicus 


glutamate receptor subtype delta-1


338 


66 


564 


W30638 


Homo sapiens 


Partial human 7-transmembrane receptor
HAPOI67 protein. 


371 


100 


565 


AC005620 


Homo sapiens 


R33590 1 


467 


97 


566 


Y99358 


Homo sapiens 


Human PRO1772 (UNQ834) amino acid
sequence SEQ ID NO:63.


1138 


78 


567 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


1002 


58 


568 


AF151043


Homo sapiens 


HSPC209 


798 


100


569


AF097518


Homo sapiens


liver-specific transporter


231


100

570 
571 

572
573 
574 
575 


AB035698 
Y07096 

AL031177 
Y66639 
AB037108 
D43949 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Misshapen/NIK-related kinase MINK-1
Colon cancer associated antigen precursor 
sequence. 

dJ889M15.3 (novel protein) 
Membrane-bound protein PRO290. 
seven transmembrane domain orphan receptor 
This gene is novel. 


1532 
1064 

735 
254 
1883 
836 


100 
100 

55 
45 
99 
100 


576 
577 
578 
579 

580
581 


Y48596 
G00352 
R95913 
AK025116
Y86473

AF196779


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


Human breast tumour-associated protein 57.

Human secreted protein, SEQ ID NO: 4433. 

Neural thread protein. 

unnamed protein product 

Human gene 52-encoded protein fragment, SEQ 

ID NO:388.

JM10 protein 


108 
141 
140 
201 
77 

450 


50 
75 
65 
70 
70 

100 


582 
583 


AF188706
AB030234 


Homo sapiens 

Canis 

familiaris 


g20 protein 

D4 dopamine receptor 

Human secreted protein, SEQ ID NO: 6702. 


330 
64 

345 


98 
56 

90 


584 
585 

586 


G02621 
AL096828 

Y30819 


Homo sapiens 
Homo sapiens 

Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2
Antigen) 

Human secretefl protein enuuucu u\>m gciiw 
Human secreted protein, SEQ ID NO: 4438. 


268 

235 
132 


85 

35 
56 


587 
588 
589 


G00357 
G02872 
AF235017 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


Human secreted protein, SEQ ID NO: 6953.

2P1 protein 


182
764


79 
80 


590 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


329 


81 


591 


Y30709 


Homo sapiens 


Amino acid sequence of a human secreted
protein. 


110 


43 


592 


Y53875 


Homo sapiens 


A human seven transmembrane signal transducer 
polypeptide. 


1369 


92 


593 
594 


Y53051 
Y27658 


Homo sapiens 
Homo sapiens 


Human secreted protein clone ddliyj* protein 

sequence SEQ ID NO: 108. 

Human secreted protein encoded by gene No. 92. 


1112 
763 


97 
79 


595 


G03798 
AF151110 


Homo sapiens 
Mus 

musculus 


Human secreted protein, SEQ ID NO: 7879. 
COP1 protein 

Human secreted protein, SEQ ID NO: 7867. 


156 
2215 

157 


58 
95 

65 


597 
598 

599 
600 


G03786 
AF192499 

AF119855
G02872 


Homo sapiens 
Mus 

musculus 
Homo sapiens 
Homo sapiens 


putative secreted protein ZSIG37 
PRO1847

Human secreted protein, SEQ ID NO: 6953. 
Human secreted protein encoded by gene 38.


143 

236 
212 
567 


40 

76 
73 
88 


601 
602 
603 
604 

605 
606 


Y00295 
AF184971 
AF061936 
AL096828 

AB033106 
X75756 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


ri UiTi tin sccrcicu jjjuiv<ni v»iwiww»vi vj 

class II cytokine receptor ZCYTOR7 

diacylglycerol kinase iota 

dJ963E22.1 (Novel protein similar to NY-REN-2

Antigen) 

KIAA1280 protein 
protein kinase C mu 


2015 

773 

1333 

3915 
3916 


74 
96 
93 

100 
99 


607 
608 
609 


D86983 
W69341 
W88627 


Homo sapiens 
Homo sapiens
Homo sapiens 


Similar to D. melanogaster peroxidasin (U11052)

Secreted protein of clone CG279 1. 
Secreted protein encoded by gene 94 clone 


5758 
1377 
339 


99 
99
82


610 
611 


Y27868 
AF202636 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene No. 
107. 

angiopoietin-like protein PP1158


116 
"^164 


62 
100 


612 
613 


AF090944
Y02693 


Homo sapiens 
Homo sapiens 


PRO0663 

Human secreted protein encoded by gene 44 
clone HTDAD22. 


218 
195 


82 
59 


614 

615 
616 


M87053 

AC004232 
G01984


Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 


lens membrane protein
FPM315

Human secreted protein, SEQ ID NO: 6065.


450 

163 
205 


84 

37 
79


617 
618 


Y91524 
AJ245621 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by 
gene /4 oHv iN ^'- 1 y ' ■ 
CTL2 protein 

Human secreted protein encoded by gene 75. 


821 

2258 
108 


99 

99 
64 


619 
620 
621 

622 


Y76198 

AF067864 

D90721 

W75858 


Homo sapiens 
Homo sapiens 
Escherichia 
coli 

Homo sapiens 


transferrin receptor 2 alpha 
Transmembrane protein dppC 

Human secretory protein of clone CS752-3. 


3922 
573 

730 


94 
90 

100 


623 
624 


Y94982 
AF034745 


Homo sapiens 
Mus 

musculus 


Human secreted protein vbl2 1, SEQ ID NO. 4. 
LNXp80 


733 
637 


100 
83 


625 
627 


U42580 

U79260 
R95913 


Paramecium
bursaria 
Chlorella 
virus 1 

Homo sapiens 
Homo sapiens 


Pro-rich, IPPPNMSLPLS (3x) 
unknown 

Neural thread protein.


94

194 

99 


46 

70 
50 


628 
629 
630 

631 


G03450 
Y36281 
Y02693 

G02139 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein, SEQ ID NO: 7531. 
Human secreted protein encoded by gene 58. 
Human secreted protein encoded by gene 44 
clone HTDAD22. 

Human secreted protein, SEQ ID NO: 6220. 


427 
590 
165 

268 
351 


100 
100 
76 

96 
80 


632 
633 
634 


U16996 

AF121857 

AF283772 


Homo sapiens 
Homo sapiens 
Homo sapiens 


protein tyrosine phosphatase
sorting nexin 7 

similar to Homo sapiens ribosomal protein L10
encoded by GenBank Accession Number 
L25899 


340 


100 
77 


635 
636 


Y07090 
AB013382


Homo sapiens
Homo sapiens 


Renal cancer associated antigen precursor 

sequence. 

DUSP6 


277 
414 


64 

76 


637 
638 


G02872 
M95762 


Homo sapiens 

Rattus 

norvegicus 


Human secreted protein, SEQ ID NO: 6953. 
GABA transporter 


315 
219 


71 
89 

60 


639 
640 


G03789
Y01400 


Homo sapiens

Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 
Secreted protein encoded by gene 18 clone 
HNHF029. 


137 


79 


641 


AC008075 


Arabidopsis 
thaliana 


F24J5.4 


121 


33 


642 


W74824 


Homo sapiens 


Human secreted protein encoded by gene 96 
clone HAQBK61. 


615 
485 


62 
98 


643 
644 

645 
646 
647 


AB015982
Y25806 

AF122904
AF233323 
W48804 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


serine/threonine kinase 

Human secreted protein fragment encoded from
gene 23. 

membrane protein DAP 10 

Fas-associated phosphatase-1

Homo sapiens clone BK158 1 protein. 


162 

474 
200 
1203 


46 

100 

38 

99 


648

649 
650 
651 


AF257330 
Y36203 
G02872 
Y32199 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


COBW-like protein 

Human secreted protein #75. 

Human secreted protein, SEQ ID NO: 6953. 

Human receptor molecule (REC) encoded by

Incyte clone 2022379.


1440 
233 
173 
1012 


98 

73

78 

100 


652 


AB032909 


Hylobates
agilis 


dopamine receptor D4 


122 
186 


32 
69 


653 
654 


AK021848 
W73411 


Homo sapiens 
Homo sapiens 


unnamed protein product 

Human secreted protein encoded by Gene No. 

15. 


57 


37 


655 

656


L22455 
G03112


Rattus 
norvegicus 
Homo sapiens 


mu opioid receptor 

Human secreted protein, SEQ ID NO: 7193. 


116 

no 


34 
45 


657 
658 


G02345 
W88627 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6426.
Secreted protein encoded by gene 94 clone 
HPMBQ32. 

Human secreted protein, SEQ ID NO: 6913. 


459 
291 

134 


97 
75 

65 


659 
660 


G02832 
Y91423 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by 
gene 11 SEQ IDNO:U4. 


333 


96


Human secreted protein, SEQ ID NO: 7870.


168


68


661 


G03789 
Y53886 


Homo sapiens 
Homo sapiens 


A suppressor of cytokine signalling protein 

designated HSCOP-6. 

Human GTP binding protein APD08. 


375 
629 


43 
100 


663 
664 

665 


W75771 
AL096770 

AB037734 


Homo sapiens 
Homo sapiens 

Homo sapiens 


bA150A6.2 (novel 7 transmembrane receptor 
(rhodopsin family) (olfactory receptor like)
protein (hs6M1-21))

rvLAAljl^ proicui . , — 


480 

978 
192 


55 

96 
84 


666 
667 
668 


W82841 
W82841 
AB030184 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


Human cerebral protein-1.

Human cerebral protein-1.

contains transmembrane (TM) region and ATP 

binding region 


182 

757 


87 
68 


669 


AB032919 


Hylobates
muelleri 


dopamine receptor D4 


85 


37 
81 


670 

671 

672 
673 


AF107295 

Z33642 
W85608 
G03203 


Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 
Homo sapiens 


outer membrane protein 

leukocyte surface protein 
Secreted protein cionc uu*»iu j. 
" Human secreted protein, SEQ ID NO: 7284. 


746 

394 
261 
106 


93 
91 
48 


674

675 
676 
677 


at rnssR7 
Y59668 
G03797 
AF026954 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Bos taurus


dJ475N16.4 (KIAA0240)
Secreted protein 108-005-5-0-C1-FL.

Human secreted protein, SEQ ID NO: 7878.
pyruvate dehydrogenase phosphatase regulatory
subunit precursor; PDPr 


2388 
1134 
174 
1013 


99 
53 
74 
95 

~96~ 


678 


L11625 


Mus 

musculus 


receptor protein-tyrosine kinase 
dJ167A193 (novel protein) 


545 
745 


100 


679 
680 

681 


AL031427 
AJ133430

nrv?^T? 


Homo sapiens 
Mus 

musculus 
Homo sapiens 


olfactory receptor 

Human secreted protein, SEQ ID NO: 6613. 


528 
179 


77 
70 


682 
683 


G03789 
Y94943 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.
Human secreted protein clone ytl4J protein 
sequence SEQ ID NO:92.


336 
118 


76 
100 


684 
685 


U43360 
G00885 


Peromyscus 
maniculatus 
Homo sapiens 


reverse transcriptase 

Human secreted protein, SEQ ID NO: 4966.


100 

162 
590 


37 

60 
100 


686 
687 
688 


AK001518 

G01982 

Y92241 


Homo sapiens 
Homo sapiens 
Homo sapiens 


unnamed protein product 

Human secreted protein, SEQ ID NO: 6063.
Human cancer associated antigen precursor 
(MO-REN-46). 


718 
2405 


100 
99 


689 


AC024792 


Caenorhabditis
elegans


contains similarity to TR:P78316


423 


36 


690 


Y27868 


Homo sapiens 


Human secreted protein encoded by gene No. 

107 


183 


81 


691 

692 
693 


Y56514 

Y27795 
Y36268 


Homo sapiens 

Homo sapiens 
Homo sapiens


Human Jurkat cell clone F2-15 AIM 10 longest

ORF protein sequence.

Human secreted protein encoded by gene No. 79.
Human secreted protein encoded by gene 45. 


180 

1539 

428 

308 


88 

99 
98 
89 


694 
695 
696 
697 


U12465
Y45272 
AF191838
Y02693 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


ribosomal protein L35 

Human secreted protein encoded from gene 16. 
TANK binding kinase TBK1
Human secreted protein encoded by gene 44
clone HTDAD22.


1517 
1242 
275 


99 
98 

75 


698 


Y87280 


Homo sapiens 


Human signal peptide containing protein HSPP-57
SEQ ID NO:57.


576 


90 


699 
700 


Y97999 
AJ006701


Homo sapiens 
Homo sapiens


Human SCAD family molecule HSFM-1, SEQ
ID NO:1.

putative serine/threonine protein kinase 


729 
610 


99 
79 


701 

702 


AF209198 
AJ298841 


Homo sapiens 
Mus 

musculus 


zinc finger protein 277 
torsinA protein 


2357 
709 

622 


100 
98 


703 
704 

705 


AK021729 
Z46787 

G02882 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens 


unnamed protein product 

similar to Glutaredoxin, Zinc finger, C3HC4 

type (RING finger) 

Human secreted protein, SEQ ID NO: 6963.


920 
589 


51 
98


Human secreted protein, 1L> NO. 6j1U.


125


58


706 
707 

708 


302501 
R95326 

G03002 


Homo sapiens
Homo sapiens 

Homo sapiens 


Tumor necrosis factor receptor I death domain 
ligand (clone 2DD).

Human secreted protein, SEQ ID NO: 7083.


121 
125 


95 
39 


709 
710 


Y96202 
M63577 


Homo sapiens 
Saccharomyc 
es cerevisiae 


IkappaB kinase (IKK) binding protein, Y2HM>. 
SFP1 


516 
131 


98 
59 


711 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase

protein tyrosine phosphatase (PTP-BAS, type 3) 


467 
368 


85 
44 


712 
713 

714 
715 


D212U 
AF044033 

G03561 
AB033062 


Homo sapiens 
Marmota 
marmota 
Homo sapiens 
Homo sapiens 


olfactory receptor 

Human secreted protein, SEQ ID NO: 7642. 
KIAA1236 protein

Human secreted protein, SEQ ID NO: 4658.


615 

251 

1380 

80 


83 

100 
100 
73 


716 
717 
718 
719


G00577 
Y96864 
AJ243396
U47334 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


cpn in from WO0034474. 
voltage-gated sodium channel beta-3 subunit 
similar to chicken gamma aminobutyric acid 
receptor beta4 subunit


835 
234 
578 

1096 


99 

100 

99 

100 


720 
721 


AB020598 
Y53886 


Homo sapiens 
Homo sapiens 


peptide transporter 3 

A suppressor of cytokine signalling protein

designated HSCOP-6. 

insulin receptor-related receptor


570 
6787 


74 
100 


722
723 


J05046 
AF001958 


Homo sapiens 

Ambystoma 

tigrinum 


electrogenic Na+ bicarbonate cotransporter, 
NBC 


111 


41 


724 




Mus 

musculus 


semaphorin cytoplasmic domain-associated 
protein 3A


5253 
3114 


94
99


725 

726


X54673 
AF016191 


Homo sapiens 

Rattus 

norvegicus 


GABA transporter 
potassium channel 


370 


100 

35 


727 
728 


AB029559 
Y28503 


Rattus 
norvegicus 
Homo sapiens 


BAT1 

HGFH3 Human Growth Factor Homologue 3. 


139 
2186 


y / 


729 


AJ011415 
293096 


Homo sapiens 
Homo sapiens 


plexin-Bl /SEP receptor 

bK390B3.1 (manic fringe (Drosophila) 

homolog) 


729 
142 


56 
68 


731 
732 


Z10062 
AF161382 


Homo sapiens 
Homo sapiens 


cDNA encoding a human vanilloid receptor

homologue Vanilrepl. 

HSPC264 


675 
492 


99 
94


733 
734 


AB029033 
AE000493 


Homo sapiens 

Escherichia 

coli 


KIAA1110 protein
putative transport protein 


3826 
592 


99 
97

99


735 


AL033379 


Homo sapiens 


dJ41 7022.2 (novel 7 transmembrane receptor 
(rhodopsin family) protein similar to high- 
affinity lysophosphatidic acid receptor homolog) 


2173 




736 
737 


AF132599
X55019 


Homo sapiens 
Homo sapiens


RANTES factor of late activated T lymphocytes-1

acetylcholine receptor delta subunit 
voltage-gated chloride ion channel 


245 
1978 


56 
100 


738 
739 
740 

741 
742 


X91906 

AB026116 

D00570 

W03626 
U66059 


Homo sapiens 
Homo sapiens 
Mus 

musculus 
Homo sapiens 
Homo sapiens 


organic anion transporter 4 

open reading frame (196 AA) 

Human thyrotropin GPR N-terminal sequence.
V segment translation product 


1444 

83 

118 
614 


98 
24 

40 
100 


743 
744 
745 


AF119815

X16663 

W67838 


Homo sapiens 
Homo sapiens 
Homo sapiens 


G-protein-coupled receptor m 

haematopoietic lineage cell protein (AA 1-486)
Human secreted protein encoded by gene 32 
clone HLTCJ63. 

Human semaphorin Y.


148 
448 

2414 


99 
93 
95 

100 


746 
747 


W57260 
W21578 


Homo sapiens 
Homo sapiens 


Alzheimer's disease protein encoded by DNA 
from plasmid pGCS2232.


968 


65 


748 

749 
750 


Y94935 

AL022238 
G03889 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein clone yd218_l protein 

sequence SEQ ID NO:76. 

dJ3042K10.5 (novel protein) 

Human secreted protein, SEQ ID NO: 7970.


622 

314 
391 


100 

85 
87 





751 
752 


AB025258 
Y52386 


Mus 

musculus 
Homo sapiens 


granuphilin-a 

Human transmembrane protein mtuzvaaj. 
Human breast tumour-associated protein 47. 


773 

900 
2527 


41 

99 
99 


753 
754 
755 


Y48586 
M85183 


Homo sapiens 
Homo sapiens
Rattus 
norvegicus 


putative G protein-coupled receptor 92 
vasopressin receptor 


694 
979 


100 
68 


756 


AF190501 


Homo sapiens 


leucine-rich repeat-containing G protein-coupled 
receptor 6 


388 


71 


757 


Y02692 


Homo sapiens 


Human secreted protein encoded by gene 43 
clone HTADX17.


461 
439 


87 
98 


758 
759 


Z22535 
R04932 


Homo sapiens 
Homo sapiens 


ALK-3 

Interferon-gamma receptor segment from clone 
39 responsible for binding the target


564 


97 


760 

761 
762 


W74902 

G03706 
AB020676 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 175 
clone HE8BI92. 

Human secreted protein, SEQ ID NO: 7787. 
KIAA0869 protein


1217 

223 
4433 


99 

88 
99 


763 
764 
765 


AK026992 
AF173358 
AF268066 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


unnamed protein product 

glucocorticoid receptor AF-1 coactivator-1 

netrin 4 

Human breast tumour-associated protein 46. 


573 
2019 

1169 


99 

100 

89 

89 


766 
767 


Y48585 
AF230378 


Homo sapiens 
Mus 

musculus 




309 


45 


768 


AF121975 


Mus 

musculus 


odorant receptor S18


268 
611 


62 
57 


769 
770 


AB008515 
Y09945 


Homo sapiens 

Rattus 

norvegicus 


RanBPM 

putative integral membrane transport protein 


458 
688 


50 
99 


771 
772 


AF226731 
Y27132 


Homo sapiens 
Homo sapiens 


AD026

Human glioblastoma-derived polypeptide (clone
OA004FG). 


1384 
1821 


100 
98 


773 
774 

775 
776 


X87832 
AB025258 

AF125101 
G02815 


Homo sapiens 
Mus 

musculus 
Homo sapiens 
Homo sapiens 


NOV/plexin-Al protein 
granuphilin-a 

HSPC040 protein 

Human secreted protein, SEQ ID NO: 6896. 
Human secreted protein, SEQ ID NO: 6574. 


500 

314 
191 


41 

93 
95 
68 


777 
778 
779 
780 


R03301 
AL357374 
AF100346


Homo sapiens

Homo sapiens 
Homo sapiens 
Homo sapiens 


Sequence of pre-human atrial natriuretic peptide. 
bA353C18.2 (novel protein) 
neuronal voltage gated calcium channel gamma- 
3 subunit 


213 
232 


45 

100 

89 


781 


Y19566 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 

Human secreted protein encoded by gene 10. 


103 
1098 


52 
93 


782 
783 


Y36233 
AF084464 


Homo sapiens 

Rattus 

norvegicus 


GTP-binding protein REM2 


141 


30 


784 
785 


W49042 
AF238381 


Homo sapiens 
Homo sapiens 


Human low density lipoprotein binding protein 

LBP-3. 

PTOV1 


2693 

1904 
547 


99 

91 
100 


786 
787 
788 

789 
790 


Y91870 
Y71062 
AF117754

AL049569 
AF151848 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Human apoptosis related protein. 
Human membrane transport protein, M 1 KP- /. 
thyroid hormone receptor-associated protein 
complex component TRAP240
dJ37C10.3 (novel ATPase)
CGI-90 protein 


1062 
8684 

2848 
745 


94 

98 

96 
96 


791 
792 
793 


Y08639 
Y41 /Uo 
AF121228 


Homo sapiens 
Homo sapiens 


nuclear orphan receptor ROR-beta
Human PRO381 protein sequence.
thyroid hormone receptor-associated protein 
complex component TRAP95 


1421 

644 

1037 

124 


95 
99 
100 

62 


794 

795 

796 


G04072 
Y69384 

W40215 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein, SEQ ID NO: 8153. 
Amino acid sequence of a 14274 receptor 
protein. 

Human macrophage antigen 


119 
1358 


100 

99 



797
798


AF258340
AF159615


Homo sapiens
Homo sapiens


hepatocellular carcinoma-associated antigen 112
FGF receptor activating protein 1
Human normal uterus tissue derived protein 26.


1151
461
797


99
98
99

799 
800 
801 
802 

803 
804 


Y59863
W70459 
L00073 
P92219 

X15357 
W64473 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
(human) 
Homo sapiens 
Homo sapiens 


Human Tl-receptor ligand lH splice variant 2. 
renin 

CR1 protein. 

ANP-A receptor preprotein (AA -32 to 1029) 
Human secreted protein from clone EC172_1. 


<TO 

1913 
11963 

5199 
4018 
2067 


93 
97 

98 

100 


805 
806 
807 
808 


AJ243874 
G01731 
Z24680 
AF171669 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


oligophrenin-4 

Human secreted protein, SEQ ID NO: 5812. 

gt"? . : ■ 

glycoprotein-associated amino acid transporter

LAT2 


284 

1562 

1364 

1154 


100 

83 

90

96


809 
810 

811 


W70321 
W74843 

AF108831


Homo sapiens 
Homo sapiens 

Homo sapiens 


Secreted protein CC 198 1. 

Human secreted protein encoded by gene 115

clone huvdau^. 

K:Cl cotransporter 3


855 
4561 


99 
100 


812 
813 

814 


AF092135 
AF283772 

G01563 


Homo sapiens 
Homo sapiens 

Homo sapiens 


PTD014 

similar to Homo sapiens ribosomal protein L10 
encoded by GenBank Accession Number 
L25899 

Human secreieu protein, ocy m -"^ » ■■ 


862 
784 

330 


100 
100 

100 


815 
816 

817 
818 
819 


AF051151 
W95630 

G01082

AF151800 

L00352 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Toll/interleukin-1 receptor-like protein 3 
Homo sapiens secreted protein gene clone 
gnll4 1. 

Human secreted protein, SEQ ID NO: 5163. 
CGI-41 protein 

low density lipoprotein receptor


3850 
358 

549 

1106 

3980 


99 
100 

100 

95 

100 


820 
821 
822 
823 


X04434 
G03844 
AF212220
Y50125 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


IGF-I receptor 

Human secreted protein, SEQ ID NO: 7925.
TERA 

Human glycophosphatidylinositol-anchored 

pro tern OPl-lZz. 


5832 
572 
396 
4897 

2675 


99 
100 
48 
99 

98 


824 
825 


AF156778 
AF096322 


Homo sapiens 
Homo sapiens 


ASB-3 protein 

neuronal voltage-gated calcium channel gamma- 
2 subunit 


1105 


100 


826 
827 


Y07972 
AB032013 


Homo sapiens 
Homo sapiens 


Human secreted protein fragment #2 encoded 
from gene 28. 
potassium channel Kv8.1 


1540 
2435 


100 
95 


828 
829 

830 
831 
832 


Y13620
Y91474 

X54232 
X14830 
Y71262 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


BCL9 

Human secreted protein sequence encoded by 
gene 24 SEQ ID NO:147.

glypican 

acetylcholine receptor beta-subunit preprotein
Human chondromodulin-like protein, Zchml.


5284 
541 

1625 
2540 
1002 


96 
98 

87 
100 
100 
96 


833 
834 
835 
836 


G03873 
AC003030 
Y38422 
U41557 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Caenorhabditis
elegans


Human secreted protein, SEQ ID NO: 7954.
R29828 1 

Human secreted protein, 
glycine-rich


638 
1389 
964 
85 


93 
87 
36 


837 

838 
839 


AL121889 

AJ011415 
W80398 


Homo sapiens 

Homo sapiens 
Homo sapiens 


dJ1076E17.1 (KIAA0823 protein (continues in

AL023803)) 

plexin-B1/SEP receptor

A secreted protein encoded by clone cwl 543 3.
Human secreted protein, SEQ ID NO: 4943.


998 

1580 
1105 
255 


75 

60 
67 
92 


840 
841 
842 
843 

844 
845 
846 
847 


G00862 
G02650 
AF036717 
Y73446 

G02872 
AF151810 
X83378 
AC004883 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6731. 
FGFR signalling adaptor SNT-1 
Human secreted protein clone yc27_l protein 
sequence SEQ ID NO. 114.
Human secreted protein, SEQ ID NO: 6953.
CGI-52 protein 
putative chloride channel 
similar to general transcription factor 2I; similar


644 

2629 

1089 

357 
1443 
1620 
655 


97 
99 
100 

69 
88 
99 
96
SEQ
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 








to AF038969 (PID:g2827207)






848 


X99886 


Homo sapiens 


monocyte chemotactic protein-2 


160 


76 


849 


AC005587 


Homo sapiens 


similar to mouse olfactory receptor u, similar to

H1AGQA fOXV\-rvA A.A'i(\^,\ 


963 


98 


850 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


851 


AF124490


Homo sapiens 


ARF GTPase-activating protein GIT1


3415 


98 


852 


Y86217 


Homo sapiens 


Human secreted protein hwhuujh, SEQ ID
NO:132. 


1 IRQ 


99 


853 


AF224741 


Homo sapiens 


chloride channel protein 7 


3748 


99 


854 


X17094


Homo sapiens 


furin (AA 1-794)


3550 


99 


855 


W78245 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 19. 


1245 


99 


856 


R97569 


Homo sapiens 


Interleukin-2 receptor associated protein p43. 




100 


857 


Y41765 


Homo sapiens 


Human PRO1083 protein sequence. 


3211 


99 


858 


AF057306 


Homo sapiens 


transmembrane proteolipid 






859 


AK025116 


Homo sapiens 


unnamed protein product 




60 


860 


Y41312 


Homo sapiens 


Human secreted protein encoded by gene 5 clone 
HLDRM43. 


OV4 




862 


Y25776 


Homo sapiens 


Human secreted protein encoded from gene 66. 




QO 

yy i 


863 


Y74188 


Homo sapiens 


Human prostate tumor EST fragment derived 
protein #375. 


96 


30 


864 


AF167473


Homo sapiens 


heme-binding protein 


O IKJ 


99


865 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 


2U 


67 


866 


X54870 


Homo sapiens 


Type II integral membrane protein 


1201 


1 HTi 

1 kAj 


867 


G00700 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4781.


640 


99


868 


Y07894 


Homo sapiens 


Human secreted protein fragment encoded from 
gene 43. 


388 


99


869 


J00123 


Homo sapiens 


preproenkephalin


1 "2 A □ 


05 


870 


Y91632 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 25 SEQ ID NO:305. 


i r\A o 
l04o 


OS 
yo 


871 


L04311 


Homo sapiens 


GABA-alpha receptor beta-3 subunit 


Oil 


yo 


872 


Y29988 


Homo sapiens 


Human cytokine family member EF-7 protein. 


960 


94 


873 


AF161382 


Homo sapiens 


HSPC264 




00 
yy 


874 


G03412 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7493. 


464 


100 


875 


Y27572 


Homo sapiens 


Human secreted protein encoded by gene No. 6. 


573 


Q£ 
yO 


876 


M15530 


Homo sapiens 


B-cell growth factor 


171 


jO 


877 


W63681 


Homo sapiens 


Human secreted protein 1.


1652 


99


878 


L27867 


Rattus 
norvegicus 


neurexophilin 


1 A A O 

1 445 


OR 

yo 


879 


Y10835 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


321 


100 


880 


W88991 


Homo sapiens 


Polypeptide fragment encoded by gene 144. 




100 


881 


AF118670


Homo sapiens 


orphan G protein-coupled receptor 


1 Q7I 

17/1 


100 


882 


AF208865 


Homo sapiens 


EDRF 


JZo 


100 


883 


Y18462 


Homo sapiens 


cathepsin L 


209 


72 


884 


Y94950 


Homo sapiens 


Human secreted protein clone dhl073_I2 protein 
sequence bbQ ID NO. ILK). 




100 


885 


AF070661 


Homo sapiens 


HSPC005 


404 


100 


886 


Y04315 


Homo sapiens 


Human secreted protein encoded by gene 23. 


385 


100 


887 


X92744 


Homo sapiens 


hBD-1 


375 


100 


888 


Y22496 


Homo sapiens 


Human secreted protein sequence clone 
cn621 8. 


994 


94 


889 


Y41293 


Homo sapiens 


Human soluble protein L. i Mru- 1 


4595 


99 


890 


G03714 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7795.


147 


63 


891 


AF208856 


Homo sapiens 


BM-014


1012 


99 


892 


U29195 


Homo sapiens 


neuronal pentraxin II


2002 


98 


893 


X68149 


Homo sapiens 


Burkitt lymphoma receptor 1 




100 




I yt+yl 1 * 




Human secreted protein clone pw337_6 protein 
sequence SEQ ID NO:34. 


537 


100 


895 


W61630 


Homo sapiens 


Clone HNFGW06 of EGFR receptor family. 


326 


63 


896 


M24110 


Homo sapiens 


GOS19-2 peptide precursor 


481 


100 


897 


Z68747 


Homo sapiens 


imogen 38 


2018 


99 


898 


AF186112 


Homo sapiens 


neurokinin B-like protein ZNEUROK1 


619 


100 


899 


AF225420 


Homo sapiens 


AD025 


734 


100 






WO 01/57188 



PCT/US01/03800






900 



901 



902 



P60657 



Homo sapiens 



Sequence of human lipocortin. 



M27288 



Homo sapiens 



oncostatin M 



W85737 



Homo sapiens 



Polypeptide with transmembrane domain. 



G01349



Homo sapiens 



Y00261



Homo sapiens 



Human secreted protein, SEQ ID NO: 5430. 
Human secreted protein encoded by gene 4. 



1297 



749 



650 



1133 



99 



100 
99 



99 
99 



AF039688 



Homo sapiens 



antigen NY-CO-3



AB007836 



Homo sapiens 



Hic-5 



2544 



AB017507 



Homo sapiens 



Apg12



224 



AK000056 



Homo sapiens 



unnamed protein product 



1537 



Y86299 



Homo sapiens 



Human secreted protein HFOXB55, SEQ ID 
NO:214. 



427 



AF231023



Homo sapiens 



protocadherin Flamingo 1 



7393 



Y14134 



Homo sapiens 



Vascular endothelial cell growth inhibitor beta 
protein sequence.



1319 



Z90420 



Homo sapiens 



Y19757 



G03172 



U14971 



AF172854



AC005525 



Homo sapiens 
Homo sapiens 



Homo sapiens 
Homo sapiens 



AF166350



Y87285 



Y36131 



AF193766 



Y95013 



Homo sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Human GDF-3 (hGDF-3) polypeptide encoding 
cDNA. 



1950 



SEQ ID NO 475 from WO9922243.



1361 



Human secreted protein, SEQ ID NO: 7253. 
ribosomal protein S9 



112 



cardiotrophin-like cytokine CLC 



886 
1204 



F22162 1 



1963 



ST7 protein 



4711 



Human signal peptide containing protein HSPP- 
62 SEQ ID NO:62. 



430 



Human secreted protein #3. 
cytokine-like protein C17



465 



724 



Human secreted protein vc48_1, SEQ ID NO:66.



357 



100 



100 



98 
100 



99 
100 



100 



100 



48 



90 



99 



100 
99 



100 



88 



100 



100 
100 



X75208 



Homo sapiens 



Y96202 



Homo sapiens 



AB039886 



Homo sapiens 



protein tyrosine kinase-receptor 



5256 



IkappaB kinase (IKK) binding protein, Y2H56. 



813 



down-regulated in gastric cancer 



785 



98 
78 



G03368 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7449.



Y48606 



Homo sapiens 



Human breast tumour-associated protein 67 



539 



Y36151 



AF110399



AF210317 



Y73328 



Homo sapiens 



Human secreted protein #23. 



668 



Homo sapiens 



elongation factor Ts 



Homo sapiens 



facilitative glucose transporter family member 
GLUT9



1666 
2763 



Homo sapiens 



HTRM clone 082843 protein sequence. 



931 
274 



100 



100 



100 
99 



100 
100 



G01959 



Homo sapiens 



U47924 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6040.
B-cell receptor associated protein 



1469 



G03827 



Homo sapiens



Human secreted protein, SEQ ID NO: 7908. 



529 



AB039371 



Homo sapiens 



mitochondrial ABC transporter 3 



X56385 



Canis 
familiaris 



rab8 



196 
1064 



B08906 



Homo sapiens 



Human secreted protein sequence encoded by 
gene 16 SEQ ID NO:63.



117 



M13692



Homo sapiens 



alpha-1 acid glycoprotein precursor



1064 



Y53886 



Homo sapiens 



A suppressor of cytokine signalling protein 
designated HSCOP-6.



515 



Y16630 



Homo sapiens 



Human Putative Adrenomedullin Receptor 
(PAR). 



1904 



AC005102 



Homo sapiens 



small inducible cytokine subfamily A member 
24 



627 



M12886 



Homo sapiens 



T-cell receptor beta chain 



1289 



AF226046 



Homo sapiens 



Y36078 



Homo sapiens 



GK003 

Extended human secreted protein sequence, SEQ 
ID NO. 463. 



1049 



667 



100 



93 



63 
100 



44

99



42 



99



99



81 



98 



100 



M22877 



Homo sapiens 



W67869 



Homo sapiens 



cytochrome c

Human secreted protein encoded by gene 63 
clone HHGDB72. 



565 



551 



W67859 



Homo sapiens 



Human secreted protein encoded by gene 53 
clone HBMCL41 



283 



W85726 



Homo sapiens 



Novel protein (Clone BG33J7)T 



789 



AJ242015 



Homo sapiens 



eMDC II protein 



4236 



G04075 



Homo sapiens



Human secreted protein, SEQ ID NO: 8156.



567 



100 
93 



100 



100 



100 



99
SEQ
ID

NO:


Accession
No.


Species


Description

candidate tumor suppressor p33 ING1 homolog


Smith-
Waterman
Score

1314


%

Identity
100


951
952 

953 
954 
955 


AF110645
Y36111 

AB012109 
AF246221
AF054986 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Extended human secreted protein sequence, SEQ 

TP 40* 

APC10 

transmembrane protein BRJ 

putative transmembrane GTPase


402 

990

1405 

1883 


70 

100 
100 
100 
100 


956 
957 
958 
959 

960 
961 


W74726 
Y27096 
AJ222967 
Y53052 

G02694 
AF151855 


Homo sapiens

Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein fg949 3. 
Human viral receptor protein (ACVRF). 

cystinosin

Human secreted protein clone dt202_i protein
sequence SEQ ID NO: 110.
Human secreted protein, SEQ ID NO: 6775.
CGI-97 protein

diabetes mellitus type 1 autoantigen 


1879 
1581 
1920 
587 

283 

1214 

250 


100 
100 
100 

100 

96 
65 


962 
963 
964 
965 

966 
967 
968 


U26592 
AL050306 
AF078859 
AB020315 

X04571

AF146019 

AF071002 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


dJ475B7.2 (novel protein) 

PTD004 

homologue of mouse dkk-1 gene:Acc# 
AF030433 

precursor polypeptide (AA -22 to 1185)
hepatocellular carcinoma antigen gene 520 
minK-related peptide 1; MiRP1
membrane-type-5 matrix metalloproteinase


3796 
2089 
1466 

6580 
993 
632 
3545 


100 
100 
100 

99 
99 
100 
100 


970 
971 
972 

974 


AB021227 
AF180920
AF105365
AF083248
AJ132429

W61619 
AF155100 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


cyclin L ania-6a 

K-Cl cotransporter KCC4 

ribosomal protein L26 homolog 

hyperpolarization-activated cyclic nucleotide 

gated cation channel hHCN4
Clone HTPEF86 of TM4SF superfamily.
zinc finger protein NY-REN-21 antigen


1 *?70 

IJ IS 

5621 
739 
6295 

454 

2261 

1 1 HC1 


100 
99 
100 
100 

100 
100 
QQ 


975
976 
977 
978 


AVI 1 — ' * w 

AF275948 
AB026891 
AF117657


Homo sapiens 
Homo sapiens 
Homo sapiens 


ABCA1 

cystine/glutamate transporter 

thyroid hormone receptor-associated protein 

complex component TRAP80 


1 1 /oJ 

2552 

3348 

i znc\ 


100 
99 

92 


979 


AF044201 
AF119297


Rattus 
norvegicus 
Homo sapiens 


neural membrane protein 35; NMP35
neuroendocrine-specific protein-like protein 1 


15 f\) 
1170 


99 
99 


980 

981

982 


AF155652
W88499 

Z56281 


Homo sapiens 
Homo sapiens 

Homo sapiens 


potassium channel modulatory factor
Human stomach carcinoma clone hfiwiz-

encoded protein.

interferon regulatory factor 3


1983 
1 

i j j j 
2012 


99 

98 
100 


983 
984 
985 


AB026125
Y14482

AB023888 


Homo sapiens 
Homo sapiens 

Homo sapiens 


ART-4 

Fragment of human secreted protein encoded by
gene 17.

chemokine receptor CCR4


2160 
172 

1895 


70 

100 
100 


986 
987 
988 


W27291 
AF153450


Homo sapiens 

Manduca 

sexta


Human H1075-1 secreted protein 5 end.
juvenile hormone esterase binding protein 

Human secreted protein, SEQ ID NO. 7778. 


712 
226 

194 


32 
88 


989 
990 


G03697 
AF204159 


Homo sapiens 
Homo sapiens


potassium large conductance calcium-activated
channel beta 3a subunit
Human secreted protein, SEQ ID NO: 6142.


1486 
558 


100 
99 


991 
992 

993 


G02061 
AL031266 

Y66749 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens 


VM106R.1 

Membrane-bound protein PRO1124.
Human secreted protein, SEQ ID NO: 5327.


327 

4730 
141 


40 

99 
77 


994 
995 
996 


G01246
AF133845
AF117756


Homo sapiens 
Homo sapiens 
Homo sapiens 


conn

thyroid hormone receptor-associated protein
complex component TRAP150
Human stem cell antigen 2.


5811 
4999 

284 


99 
100 

93 


997 
998 

999 

1000 

1001 


W62066 
Y87173 

Y13379
Y95008 
AF190167 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens


Human secreted protein sequence SEQ ID 
NO:21Z 

Amino acid sequence of protein PRU2o3. 
Human secreted protein vf3 1. SEQ ID N<J:M>. 
membrane associated protein SLP-2 


725 

1654 

676 

1747 


100 

99 
47 
100 



SEQ
ID

NO:


Accession
No.


Species


Description

Human secreted protein, SEQ ID NO: 5315.


Smith-
Waterman
Score

398


%

Identity
96


1002 
1003 

1004 


G01234 
W73420 

X12791 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein encoded by Gene No.
24.

1 ytCu oKr-proiem i - i 


2150

742
642 


100 

100 
100 


1005 
1006 
1007 


M23323 
X63745 
Y35997 


Homo sapiens 
Homo sapiens 
Homo sapiens 


membrane protein 
KDEL receptor 

Extended human secreted protein sequence, SEQ 
ID NO. 3o2. 


326 
824 


98 
99 


1008 


AB032918 


Hylobates 
moloch 


dopamine receptor D4 


92 


35 


1009 

1010 
1011 


Y91680 

AL136125 
G03733 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by

gene 81 SEQ ID NO:353.

dJ304B14.1 (novel protein) 

Human secreted protein, SEQ ID NO: 7814. 


1372 

825
379


99 

98 
98 


1012

1013 
1014 

1015 
1016 
1017 


Y17531

G00724 

AF288092 

AB045292 
X15940
Y94873 


Homo sapiens 
Homo sapiens
Naegleria
gruberi 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein clone BL205 14 protein. 
Human secreted protein, SEQ ID NO: 4805. 
haem lyase 

M83 protein 

ribosomal protein L31 (AA 1-125) 
Human protein clone HP02632. 
dJ417M14.1 (novel protein) 


818 
462 
114 

3867 
644 
1876 
589 


97 

100 

37 

99 
100 
100 
100 


1018
1019 
1020 
1021 
1022 

1023 


AT 

X83425 
W03516 
G03960 
Y91689 

AE000660


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


Lutheran blood group glycoprotein 
Prostaglandin DP receptor. 
Human secreted protein, SEQ ID NO: 8041. 
Human secreted protein sequence encoded by 

gene 93 SEQ ID NO:362. 

hADV36Sl 


3054 
1864 
398 
768 

573 
1550 


99 
100 
100 
100 

100 
100 


1024 
1025 
1026 

1027 
1028 
1029 


AF132965 

W92380 

R66278 

X65614 
Y41741 
AJ001014


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


CGI-31 protein

Human TR-interacting protein S3 03a.
Therapeutic polypeptide from glioblastoma cell
line.

S100P calcium-binding protein

Human PRO704 protein sequence.

RAMP1


1466 
830 

476 

1323 

806 


97 
100 

100 
100 

100


1030
1031 
1032 
1033 


W63682 
AK023007 
W97900 
Y82453 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein 2. 

unnamed protein product 

Human SR-BI class B scavenger. 

Human TGC-440 secretory protein SEQ ID 

NO:1.


1354 
766 
2672 
639 


99 
100 
99 
99 


1034 


Y73473 


Homo sapiens 


Human secreted protein clone ydl78_l protein 
sequence SEQ ID NO: 168. 


752 


93 


1035 

1036 
1037 


Y86468 

U09813 
AJ242832 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human gene 48-encoded protein fragment, SEQ 
ID NO:383.

mitochondrial ATP synthase subunit 9 precursor 
calpain 


96 

698 
3699 


90 

100 
99 


1038 
1039 
1040

1041 
1042 


X66403 

AJ242730 

AF169968 

X52563
G00368 


Homo sapiens 
Homo sapiens 
Mus 

musculus 
Bos taurus 
Homo sapiens 


acetylcholine receptor epsilon subunit CHRNE 

polyhomeotic 2 

DNA binding protein DESK I 

permeability increasing protein

Human secreted protein, SEQ ID NO: 4449. 


2574 
1310 
1453 

383 
75 


100 
100 
80 

29 
50 


1043 
1044 
1045 


G02532 
M94582 
AL080239 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 
interleukin 8 receptor B
DG256022.1 (similar to IGFALS (insulin-like
growth factor binding protein, acid labile 
subunit)) 


60
1850
1704 

580 


53 

100 

50 

100 


1046

1047 


/A.1 J ^ J ll/l 

W74809 


Homo sapiens 
Homo sapiens 


HSPC040 protein 

Human secreted protein encoded by gene 8 1 
clone HMWDN32. 
dJ1042K10.4 (novel protein)


176 
2201 


100 
100 


1048 
1049 

1050 


AL022238 
W88667 

AF097518 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Secreted protein encoded by gene 134 clone 
HAIBP89. 

liver-specific transporter 


1559 
2820 


99 
100


1051 


W78324 


Homo sapiens


Fragment of human secreted protein encoded by 
gene 81. 


1318 


98 


1052 


Y21851 


Homo sapiens 


Human signal peptide-containing protein (SIGP)
(clone ID 2328134). 


1643 


95 


1053 

1054 
1055 


AL163815 

Y76200 
AJ276567 


Arabidopsis 
thaliana 
Homo sapiens 
Homo sapiens 


putative protein 

Human secreted protein encoded by gene 77. 
TC10-like Rho GTPase

Human secreted protein encoded by gene No. 54. 


661 

262 

1160 

154 


62 

100 
100 
96 


1056 
1057 
1058
1059 

1060 
1061
1062 


Y27620 
D14530 

AT \DZXrJ\J 

AL031778 

AF227135 

Y27575 

Z11697 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


ribosomal protein 
TADA1 protein 

dJ34B21.1 (novel BZRP (benzodiazapine

receptor (peripheral) (MBR, PBR, PBKS, IBP,

Isoquinoline-binding protein)) LIKE protein)

candidate taste receptor T2R9 

Human secreted protein encoded by gene No. 9. 

HB15 

putative transmembrane protein 


745 

1132 

920 

134 
1392 
1088 
819 


100 
100 
100 

33
100 
100 
100 


1063 
1064 
1065 
1066 
1067 


AF123757 

AF155135 

Y41674 

AJ250042 

Y36087 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


novel retinal pigment epithelial cell protein 
Human channel-related molecule HCRM-2. 
Rab5 GDP/GTP exchange factor homologue 
Extended human secreted protein sequence, SEQ 
ID NO. 472. 


936 

2575 

770 


99 
99 
100 
85 


1068 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_l protein 
sequence SEQ ID NO: 124. 


301 


100 


1069 

1070 
1071 
1072 
1073 


Y94959 

W64535 
X03145 
AL031177 
X82200 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens


Human secreted protein clone mc300_l protein
sequence SEQ ID NO: 124. 
Human leukocyte cell clone HP00804 protein, 
pot. ORF III

dJ889M15.3 (novel protein) 

gpStaf50 


301 

2014 
148 
821 
249 


100 

99 
50 
91 
62 


1074 
1075 
1076 
1077 
1078 

1079 
1080 


G03213 
Y36233 
G03187 
L25899 
Y91447 

G01862
AB039723 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7294. 
Human secreted protein encoded by gene 10. 
Human secreted protein, SEQ ID NO: 7268. 
ribosomal protein L10

Human secreted protein sequence encoded by 

gene 48 SEQ ID NO: 168.

Human secreted protein, SEQ ID NO: 5943.
WNT receptor frizzled-3


99 

jUO 

424 
332 
898 

290 
1376 


47

55 

98 

76 

97 

89 
92 


1081 
1082 
1083 

1084 
1085 
1086 


AB020527 

L13802 

W75098 

G03564 
G04063 
AF090942 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


Na/PO4 cotransporter homolog

ribosomal protein small subunit

Human secreted protein encoded by gene 42 

clone HSXB125.
Human secreted protein, SEQ ID NO: 7645.
Human secreted protein, SEQ ID NO: 8144.

PRO0657 

Human secreted protein, SEQ ID NO: 4598. 


269 

143 

83 
88 
124 
129 


100 

80 

81 

51 
43 
64 

41


1087

1088 
1089 
1090 
1091 
1092 


G04091 
AF140631
G04063 
S72304 
W88708 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Mus sp. 
Homo sapiens 


Human secreted protein, SEQ ID NO. 8172. 
G-protein coupled receptor 14 
Human secreted protein, SEQ ID NO. 8144.
LMW G-protein 

Secreted protein encoded by gene 175 clone 
HEMAM41. 


126 
364 
114 
146 
405 

4358 


36 
82 
32 
83 
100 

97 


1093 
1094 


W85612 
Y53012 


Homo sapiens 
Homo sapiens 


Secreted protein clone tn 123 5. 

Human secreted protein clone pm514_4 protein 

sequence SEQ ID NO:30. 


1013 


99 


1095 

1096
1097 


Y92345 

AF090942 
L24521 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human cancer associated antigen precursor from 

clone NY-REN-62.

PRO0657 

transformation-related protein 


409 

147 
166 


100 
60 

^8 


1098 
1099 
1100 


X56932 
G04063 
Y02693 


Homo sapiens 
Homo sapiens 
Homo sapiens 


23 kD highly basic protein
Human secreted protein, SEQ ID NO: 8144.
Human secreted protein encoded by gene 44 
clone HTDAD22. 


490 

83 

149 


70 
35 
59
SEQ
ID
NO:
1101


Accession
No.

AF119851


Species
Homo sapiens


Description
PRO1722

Human secreted protein, SEQ ID NO: 8167.


Smith-

Waterman 

Score 

183 

207 


% 

Identity 

72 
62 


1102 
1103 
1104 

1105 
1106 


G04086 
G04063 
X74856 

G03789 
G03133 


Homo sapiens 
Homo sapiens 
Mus 

musculus 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 
ribosomal protein L28 

Human secreted protein, SEQ ID NO: 7870. 
Human secreted protein, SEQ ID NO: 7214. 


91 
128 

130 
122 


52 
69 

62 
48 


1107 
1108 
1109 


G03040 

AF039942 

AF201951 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7121. 
HCF-binding transcription factor Zhangfei 
high affinity immunoglobulin epsilon receptor 
beta subunit 


69 

744 

738 


43 

OQ 

77

94


1110 
1111 


AF111108
AF119900


Mus 

musculus 
Homo sapiens 


transient receptor potential 2 
PRO2822

A protein that interacts with presenilins.


223 

144 

265 


79 

59 
39 


1112 
1113
1114 


Y16589 
G02872 
Y02999 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 
Fragment of human secreted protein encoded by 
gene 121. 

Human secreted protein encoded from gene 1. 


178 
164 

1217 


67 
63

99


1115 
1116 

1117 


Y30811 
X51394 

M27826 


Homo sapiens 

Xenopus

laevis 

Homo sapiens 


APEG precursor protein 

neutral protease large subunit 

Human secreted protein, SEQ ID NO: 7452. 


130 

442 
72 


40 

65 
60 


1118

1119 
1120 

1121 
1122 
1123


G03602 
Y35906 

G03714 
Y00337 
AF084830 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7683.
Extended human secreted protein sequence, SEQ
ID NO. 155.

Human secreted protein, SEQ ID NO: 7795.
Human secreted protein encoded by gene 81.
two pore domain K+ channel; TASK-2


491 
244 

122 
110 

703 
442 


97 
97 

65 
90 
94 

o o 

OO 


1124 
1125 

1126

1127 
1128 


AF212862

W64469 

G01361 

G01361 

Y84320 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


membrane interacting protein of RGS16 
Human secreted protein from clone CW795_2.
Human secreted protein, SEQ ID NO: 5442. 
Human secreted protein, SEQ ID NO: 5442. 
Human cardiovascular system associated protein 
kinase- 1. 

Human secreted protein, SEQ ID NO: 6186. 


191 
154 
165 
815 

88 


53 
100 
100 
99 

73 


1129 
1130 


G02105 
Y32923 


Homo sapiens 
Homo sapiens 


Transmembrane domain containing protein clone 
HP0Ij12. 

Human synapse related glycoprotein 2. 


700 
260 


100 
91 


1131 
1132 


Y29817 
Y91644 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by 
gene 43 SEQ ID NO:317.


525 


96 


1133 
1134 


Y91449 
AB017908 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by
gene 49 SEQ ID NO: 170. 
4F2 light chain 


542 
2399 


100 
93 


1135
1136 

1137 


X51760 
Y99426 

G03790 


Homo sapiens 
Homo sapiens 

Homo sapiens 


zinc finger protein (583 AA) 

Human PRO1604 (UNQ785) amino acid 

sequence SEQ ID NO:308.

Human secreted protein, SEQ ID NO: 7871. 


312 
917 

102 


55 
72 

50 
oi 


1138 
1139 


AF155106 
AL031055 


Homo sapiens 
Homo sapiens 


NY-REN-36 antigen 

dJ28H20.1 (novel protein similar to membrane 


768 
117 

138 


yi 
50

96


1140 
1141 


AF011359
Y70018 


Bos taurus 
Homo sapiens 


regulator of G-protein signaling 7 
Human Protease and associated protein- 12 
(PPRG-12). 

Human secreted protein, SEQ ID NO: 8172. 


623 
113 


100 
38 


1142 
1143 


G04091 
AB030235 


Homo sapiens 

Canis 

familiaris 


D4 dopamine receptor 


89 


48 


1144 

1145 
1146 
1147 
1148 
1149


Y94922 

X99962 
G03807 
G03712 
Y28279 
U13642


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Caenorhabditi 


Human secreted protein clone pv6_I protein 

sequence SEQ ID NO:50. 

rab-related GTP-binding protein

Human secreted protein, SEQ ID NO: 7888. 

Human secreted protein, SEQ ID NO: 7793. 

Human G-protein coupled receptor GRIR-1. 

exon 5 similar to transmembrane domain of S. 


539 

398 
168 
512 
705 
247 


88 

96 
79 
85 
76 
36
SEQ
ID 

NO: 



1150 



1151 



1152 



1153 



1154 



1155 



1156 



Accession 
No. 



G03438 



G01003 



G03798 



X88799 



D85245 



R74272 



Y86265 



1157- G02577 



159 



1158 AF104334



1161 



1162 



G01393



Species 



s elegans 



Homo sapiens 



Homo sapiens 



Description 



cerevisiae zinc resistance protein



Human secreted protein, SEQ ID NO: 7519.



Human secreted protein, SEQ ID NO: 5084.



Smith- 
Waterman 
Score 



117 



181. 



Homo sapiens 



Oryza sativa



Homo sapiens



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7879. 
DNA binding protein 



TR3beta 



Tumour suppressor protein, p53. 



Human secreted protein HUSXE77, SEQ ID 
NO: 180. 



Human secreted protein, SEQ ID NO: 6658.



198 



95 



155 



341 

99 



263 



Homo sapiens 



W75771 



AF216833



W67816 



Homo sapiens 



Homo sapiens 



Homo sapiens 



putative organic anion transporter 



Human secreted protein, SEQ ID NO: 5474. 



Human GTP binding protein APPQ8 



224 



M-ABC2 protein 

Human secreted protein encoded by gene 10 
clone HCEMU42. 



85 



173 



10 



156 



% 

Identity 



62 



80 



63 



41 
96 



W 
41 



98 



42 



57 



81 



83 



100 



1163 



AF119851



Homo sapiens 



PRO1722



1164 



Y87252 



Homo sapiens 



Human signal peptide containing protein HSPP-
29 SEQ ID NO:29.



13 



Human liver cell clone HP01148 protein.



338 



31 



82 
64 



1165 
1166 



W64537 
AF269286 



Homo sapiens 
Homo sapiens 



H C6 

Fragment of human secreted protein encoded by 

gene 17. 



1167 



1168 



1169 



1170 



1171 



1172 



1173 



1174 



1175 



1176 



1177 



1178 



1179 



1180 
1181 



1185 



1186 



1187 



1188 



1189 



1190 



1191 



Y14482



Homo sapiens 



D90789 



R63783 



Y45274 



D64154 



AB026256 



G00357 



D87717 



M64716 



R08330 



Escherichia 
coli 



g ^iiv * f ♦ . . : — 

Dipeptide transport system permease protein 
DppC 



411 



Homo sapiens 



Homo sapiens 



TG0847 protein. 

Human secreted protein encoded from gene 18.



344 



478 



Homo sapiens 



Mr 110,000 antigen



347 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



L06505



AJ251885 



G03258 



AF181856 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Rattus 
norvegicus 



G03797 



G03564 



AB032905 



G00956 



G03258 



G03361 



AF117755



Homo sapiens 



Homo sapiens 



Hylobates 
concolor 



Homo sapiens 



Homo sapiens 



organic anion transporter OATP-B 

Human secreted protein, SEQ ID NO: 4438.



311 



60 



Similar to human GTPase-activating 
protein(A49869) 



178 



ribosomal protein 



391 



Human IL-7 receptor clone H6. 



285 



ribosomal protein L12



242 



organic cation transporter (OCT2) 



276 



Human secreted protein, SEQ ID NO: 7339.



155 



Human secreted protein, SEQ ID NO: 5288.



tRNA selenocysteine associated protein



282 
249 



Human secreted protein, SEQ ID NO: 7878 



Human secreted protein, SEQ ID NO: 7645. 
dopamine receptor D4 



88 
118 



96



Human secreted protein, SEQ ID NO: 5037.



292 



Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7339.
Human secreted protein, SEQ ID NO: 7442.



178 



324 



thyroid hormone receptor-associated protein 
complex component TRAP230 



187 

20T 



90 



90 
9T 



96 



67 



52 
59 



78 
67 



72 




88 



71 



90 



62 



46 



78 



79 



76 



70 
67



1192 



1193 



1194 



1195 



1196 



1197 



Y70455 



Homo sapiens 



Human membrane channel protein-5 (MEUriP- 



5). 



G03052 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7133.



99 



G02607 



Homo sapiens 



W29661 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6688.



2001 



Y14104 



Homo sapiens 



X61972 



Homo sapiens 



Human GABAB receptor Id protein sequence. 
macropain subunit iota 



239 



Human secreted protein, SEQ ID NO: 4615.
Human secreted protein HELHN47, SEQ ID 
NO: 175. 




149 
145 



42 



76 



98 



69 



90 
51 






SEQ ID NO:

1201

1202



Accession No.



Species



Description



Human secreted protein, SEQ ID NO: 4919.



Smith-Waterman Score



404



% Identity



50



1203 



1204 



1205 



1206 



M27826 



Homo sapiens 



neutral protease large subunit 

Human secreted protein clone yi4_l protein 
SEQ ID NO:70^ 



Y73424 



Homo sapiens 



265 



AF264014 



Homo sapiens 



s equence 3r,y mj iw.iv. ^ 

scavenger receptor cysteine-rich type I protein 



625 



Y36203 



U78111 



Homo sapiens 



M160 precursor
Human secreted protein #75. 



Gallus gallus 



AQ 



. — 

putative G protein-coupled receptor 



219 



205 
416 



61 



98 



59 
57 
76 
75 



1207 
1208 



1209 



1210 



1211 



1212 



1213 



1214 



1215 



1216 



1217 



1218 



1219 



1220 



1221 



1222 



AF116715 



Homo sapiens 



PRO2829



AF099137 



Homo sapiens 



MaxiK channel beta 2 subunit 



AF205718 



Homo sapiens 



Y27868 



G00719 
G01009



AF090942 



Y14427



G03905 



Y57897 



J00194 



Homo sapiens 



hepatocellular carcinoma-related putative tumor 



475 
423 



suppressor 



Human secreted protein encoded by gene No. 



224 



107. 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Y59709 



W81576 



W96745 



Y35911 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ ID NO: 4800. 



Human secreted protein, SEQ ID NO: 5090. 



117 
351 



PRO0657 



124 



Human secreted protein encoded by gene 17 



99 



clone HS1EA14. 



Human secreted protein, SEQ ID NO: 7986. 



Human transmembrane protein HTMPN-21. 



hla-dr antigen alpha chain 



Secreted protein 76-28-3 -A 12-n^L 



EBV-induced G-protein coupled receptor (EBI- 
2) polypeptide. 



High affinity immunoglobulin E receptor-like 
protein (IGERB) 



Extended human secreted protein sequence, 
ID NO. 160. 



173 



1173 



454 



470 



725 



135 



95 
79 



70 



1223 



1224 



1225 
1226 



1227 



1228 



1229 



1230 



1231 



1232 



1233 



1234 



1235 



1236 
1237 
1238 



1239 



1240 



1241 



1242 



1243 



1244 



1245 



1246 



1247 



Y00278 



Homo sapiens 



j±J J><W. X w. . 

Human secreted protein encoded by gene 21 . 



AF161422 



Homo sapiens 



HSPC304 



260 
568 



U14970 



Homo sapiens 



ribosomal protein S5 



G01733 



Homo sapiens 



AF099973 



Mus 

musculus 



Human secreted protein, SEQ ID NO: 5814. 
schlafen2 



G01218 



Homo sapiens 



Human secreted protein, SEQ ID NO: 5299. 



AF217188 



Mus 

musculus 



YIP1B 



AF176813 



Homo sapiens 



soluble adenylyl cyclase 



X98333 



W74955 



Homo sapiens 
Homo sapiens 



organic cation transporter 

Human secreted protein encoded by gene 77 
clone HOEAS24. 



Y94940 



U76618 



AF044924 



G01459 

AF000018

W88633 



W29660 



AF004161



Y92710 



Y95002 



Y44905 



Homo sapiens 



Human secreted protein clone yi62_l protein 
sequence SEQ ID NO:86.



Mus 

musculus 



N-RAP 



Homo sapiens 



hook2 protein 



Homo sapiens 
Homo sapiens 
Homo sapiens 



Human secreted protein, SEQ ID NO: 5540. 

adapter protein ________ 

Secreted protein encoded by gene 100 clone 
HE8EU04. 



Homo sapiens 



Homo sapiens CH27J clone secreted protein. 



Oryctolagus 
cuniculus



peroxisomal Ca-dependent solute carrier 



Homo sapiens 



Human membrane-associated protein Zsig24 



Homo sapiens 



Human secreted protein vc34_l, SEQ ID NO:44. 



Homo sapiens 



Human potassium channel molecule ERG-LP2 
partial protein. 



AF284422 



Homo sapiens 



cation-chloride cotransporter-interacting protein



Y53629 



AB039371 



Y35911 



Homo sapiens 



A bone marrow secreted protein designated 
BMS115. 



Homo sapiens 



mitochondrial ABC transporter 3 



Homo sapiens 



Extended human secreted protein sequence, SEQ 



202 



610 



333 



155 



801 



275 



1704 

212 



526 
48T 



380 



417 
164 
250 



697 



154 



709 



908 



325 



511 



1888 



389 



168 





1248 


AF072509 


Rattus 
norvegicus 


ID NO. 160 

glutamate receptor interacting protein 2 

tandem pore domain potassium channel TRAAK 


559 
661 


OA 

98 


1249 
1250 


AF247042 
B08974 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by
gene 27 SEQ ID NO: 131. 


1087 


97 

SQ. 


1251 


L15313 


Caenorhabditis
elegans


putative 


o< 0 
OJO 




1252 

1253 
1254 
1255 


Y29338 

W01730 
G03074 
G01818 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein clone it217_2 alternate 
reading frame protein. 
Human G-protein receptor HPRAJ70.
Human secreted protein, SEQ ID NO: 7155.
Human secreted protein, SEQ ID NO: 5899.


278 

211 
294 
253 


75 
92 
91 


1256 
1257 
1258 
1259 


AF286368
AF220264 
G02227 
Y07970 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


eppin-1 
VlOST-l 

Human secreted protein, SEQ ID NO: 6308. 
Human secreted protein fragment #2 encoded 
from gene 26. 


222 
87 
281 
81 


54 
93 
78 
94 


1260 
1261 


R95332 
AF140674


Homo sapiens 
Homo sapiens 


Tumor necrosis factor receptor 1 death domain

ligand (clone 3TW). 

zinc metal loprotease ADAM i bo 


986 
172 


100 
36 


1262 
1263 


U28369 
Y07049 


Homo sapiens 
Homo sapiens 


semaphorin V 

Renal cancer associated antigen precursor 
sequence. 

Human secreted protein #25. 


237 

288 1 
187 


67 
71 

80 


1264 


Y36153 

V7R1 Id 

I /OJ JT 


Homo sapiens 
Homo sapiens 


Human cytokine signal regulator CKSR-2 SEQ 
ID NO:2.

Amino acid sequence of protein PRO334.


723 
191 


93 
100 


1266 

1 OAI 
iZO / 


Y13397 


Homo sapiens 

Rattus 

norvegicus 


phosphatidylinositol 5-phosphate 4-kinase
gamma 

candidate tumor suppressor gene LUCA-1 


859 
159 


95 
96 


1268 
1269 


U73167 
AF190664


Homo sapiens 
Mus 

musculus 


LMBR2 


552 


76 


1270 
1271 


AL050332 
G02126 


Homo sapiens 
Homo sapiens 


dJ570F3.1 (homolog of the rat synaptic ras
GTPase-activating protein p135 SynGAP)
Human secreted protein, bfc^ iu inu. o^v/. 


820 
131 


98 
95 


1272 
1273 


AF125533 
AL035661 


Homo sapiens 
Homo sapiens 


NADH-cytochrome b5 reductase isoform 
dJ568C11.3 (novel AMP-binding enzyme
similar to acetyl-coenzyme A synthetase
(acetate-coA ligase)) 


253 
1280 


92 
100 


1274 


AF064748 


Mus 

musculus 


S3-12 


3523 
377 


61 

78 


1275 
1276 

1277 


D17554 
Y30715 

AF146760 


Homo sapiens 
Homo sapiens 

Homo sapiens 


TAXREB107 

Amino acid sequence of a human secreted
protein.

septin 2-like cell division control protein


643 
707 


90 
100 


1278 
1279 

1280 


Y05069 
X59668 

G01051 


Homo sapiens 
Oryctolagus 
cuniculus 
Homo sapiens 


Human PIGR-2 protein sequence, 
aorta CNG channel (rACNG) 

Human secreted protein, SEQ ID NO: 5132. 
Human secreted protein, SEQ ID NO: 7492. 


281 
267 

489 
120 


46 

Of 

98 
43 


1281 
1282 
1283 


G03411 

AF055084 

AF117814 


Homo sapiens 
Homo sapiens 
Mus 
musculus


very large G-protein coupled receptor-1
odd-skipped related 1 protein 


1635 
357 


100 
98 


1284 


U87318 


Xenopus 
laevis 


NaDC-2 


535 


60 


1285 


AF061346 


Mus 
musculus


Edpl protein 


452 


68 


1286 


AB030182 


Mus 

musculus 


contains transmembrane (TM) region


582 


68 


1287 


A13595 


synthetic 
construct 


immunosuppresive protein PP15 


185 
837 


97 
100 


1288 
1289 


AF254411 
AF084205 


Homo sapiens
Rattus
norvegicus


ser/arg-rich pre-mRNA splicing factor SR-A1 
serine/threonine protein kinase TAO1


319 


98 



membrane associated guanylate kinase 2


523


100


1290 
1291 


AF038563 
AF034837 


Homo sapiens 
Homo sapiens 


double-stranded RNA specific adenosine 
deaminase 


468 
937 


100 
87 


1292 
1293 


M15888
AB010692


Bos taurus 

Arabidopsis 

thaliana 


endozepine-related protein precursor
ATP-dependent RNA helicase-like protein


636 
1570 


45 
100 


1294 
1295 


AF209923 
W67828 


Homo sapiens 
Homo sapiens 


orphan G-protein coupled receptor 
Human secreted protein encoded by gene 22 
clone HFEAF41. 


504 


98 
65 


1296 


AC004832 


Homo sapiens 


similar to 45 kDa secretory protein; similar to
CAA10644.1 (PID:g4164418)


648 




1297 
1298 


X80035 
G02645 


Oryctolagus 
cuniculus 
Homo sapiens 


cysteine rich hair keratin associated protein
Human secreted protein, StQ ID NO: 6 rib. 


575 
223 


70 

97
32 


1299 
1300 


Y59440 
W70504 


Homo sapiens 
Homo sapiens 


Human detta3 fragment #4. 

Leukocyte seven times membrane-penetrating 

type receptor protein JEG1 8. 


122 
459 


81

99


1301 

1302 
1303 


Y67315 

M77693 
G01331 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein BL89J3 amino acid 
sequence. 

spermidine/spermine N1-acetyltransferase
Human secreted protein, SEQ ID NO: 5412.
Human secreted protein, SEQ ID NO: 5572. 


3916 

174 
254 

7/17 


96 
69 
99 


1304 
1305 
1306 
1307 

1308 


G01491 
AF148509
G01658
Y90899 

AF033120 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


alpha 1,2-mannosidase

Human secreted protein, SEQ ID NO: 5739.

D1-like dopamine receptor activity modifying

protein SEQ ID NO: 1.

p53 regulated PA26-T2 nuclear protein 


602 
333 
332 

348 
147 


98 
98 
98 

52
66 


1309 
1310 
1311 


Y73388 

AF063243 

AF224494 


Homo sapiens 
Bos taurus 
Mus 

musculus 


HTRM clone 3376404 protein sequence. 

ribosomal protein L30 

arsenite inducible RNA associated protein

HTRM clone 2709055 protein sequence.


zyo 

688 
1154 


7v 

70
100 


1312 
1313 


Y73342 
Y99419 


Homo sapiens 
Homo sapiens 


Human PRO1780 (UNQ842) amino acid 
sequence SEQ ID NO:282.

ODfil 777 


1145 
433 


78 
97 


1314 
1315 

1316 


AF116667
W75100 

AJ272078 


Homo sapiens 
Homo sapiens 

Homo sapiens 


rKvJ if// _ - — 

Human secreted protein encoded by gene 44

APOBEC-1 stimulating protein 


807 
789 


97 

100 
98 


1317 
1318 


AB041533 
U19617 


Homo sapiens 
Mus 

musculus 


sperm antigen 
Elf-1 


2607 
806 


92 
100 


1319 


U82598 


Escherichia 
coli 


ternc enicroDai^iui u<uid|*-«v f* wl 


768 


100 


1320 


D90892 


Escherichia 
coli 


SORBITOL-6-PHOSPHATE 2-
DEHYDROGENASE (EC 1.1.1.140)
(GLUCITOL-6-PHOSPHATE
DEHYDROGENASE) (KETOSEPHOSPHATE 
REDUCTASE) 


709 


92 


1321 
1322 


W67847 
AJ276101 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 41 
clone HPBCJ74. 
GPRC5B protein 


601 

466 
504 


93 
97 


1323 
1324 
1325 


AJ276101 

Y58628 

U91561 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 


GPRC5B protein 

Protein regulating gene expression PRUb-ll. 
pyridoxine 5'-phosphate oxidase

NADH-cytochrome b5 reductase isoform 


1584 
1277 

1606 


100 
89 

100 


1326 
1327 

1328 


AF125533 
Y32206 

AF15104B 


Homo sapiens 
Homo sapiens 

Homo sapiens 


" Human receptor molecule (REC) encoded by 
Incyte clone 2825826. 
HSP^714 


1531 

657 
1645 


90 

85 
100 


1329 
1330 
1331 

1332 
1333 


Y10530 
AF180681
AF111856 

Y13583 
AF078866 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


olfactory receptor 

guanine nucleotide exchange factor

sodium dependent phosphate transporter isoform
NaPi-3b 

G-protein coupled receptor 
SURP-4 


4314 

3591 

2171 
1395 


99 
99 

100 
100 



Y25755


Homo sapiens


Human secreted protein encoded from gene 45.


1380
4742


96
99 


1335 
1336 
1337 

1338 
1339 


AF152325

X74070 

AF095927 

G03877 
AL008582 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 


protocadherin gamma A5 

transcription factor B 1 

protein phosphatase 2C 

Human secreted protein, ot^ w 1NW - /7JO - 
bK223H9.2 (ortholog of A. thaliana F23F1.8) 
leukemia inhibitory factor receptor 


639 
1931 

621 
626 
5820 


81 
95 

100 
100 
99 


1 1Af\ 

1 J4U 
1341 
1342 
1343 


AO 10 1 J 

Y01519 

AF207600

U54807 


Homo sapiens
Homo sapiens 
Homo sapiens 
Rattus 
norvegicus


A carcinogenesis-inhibiting protein.
ethanolamine kinase
GTP-binding protein 


7528 
2372 
1167 


97 
100
97 


1344 


AC020579 


Arabidopsis 
thaliana 


putative phosphoribosylformylglycinamidine
synthase; 25509-29950 


3283 
944 


51 
100 


1345 
1346 

1347 


Y28576 
W74787 

M55542 


Homo sapiens 

JtlOJntJ >u^JwJ1j 

Homo sapiens 


Secreted peptide clone pe503 1 . 

Human secreted protein encoded by gene 58 

clone HHFHN61. 

guanylate binding protein isoform 1 


1171 

2636 
1329 


100 

87 
100 


1348 
1349 
1350 


AF183428 

U70669 

AF295530 


Homo sapiens 
Homo sapiens 
Homo sapiens 


28.4 kDa protein 
Fas-ligand associated factor 3 
cardiac voltage gated potassium channel 
modulatory subunit 


167 

562 


24 
99 
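The Smith-Waterman Score and % Identity columns of the preceding table report local-alignment results against the listed database entries. As a rough illustration of what those two numbers measure, the following Python sketch computes a local alignment score and the percent identity over the aligned region. The function name `smith_waterman` and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration only; the scoring matrix and gap penalties behind the tabulated values are not stated in this section, so the sketch is not expected to reproduce them.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, percent identity of that alignment).

    Scoring parameters are illustrative assumptions, not the patent's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        aligned += 1
    identity = 100.0 * identical / aligned if aligned else 0.0
    return best, identity


if __name__ == "__main__":
    # Hypothetical peptides, not taken from the tables.
    print(smith_waterman("MKTAYIAK", "XXKTAYXX"))
```

Percent identity here is taken over the aligned columns of the best local alignment, which is one common convention; other tools divide by the query length or the shorter sequence length instead.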



TABLE 3 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
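The legend in the last column heading defines a one-letter code plus three special markers (`*`, `/`, `\`). A minimal Python sketch of how an entry in that column could be tallied is given here. The names `AMINO_ACIDS`, `MARKERS`, `summarize` and `codon_span` are hypothetical, and the sketch assumes that the predicted beginning and end locations are 1-based, inclusive nucleotide positions that may be listed in either order; neither point is stated in the table itself.

```python
# One-letter codes as given in the column legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}
# Special markers from the same legend.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize(seq):
    """Count residues and markers; anything else is reported, not guessed."""
    counts = {"residues": 0, "stop": 0, "deletion": 0, "insertion": 0}
    unrecognized = []
    for ch in seq:
        if ch.isspace():
            continue  # line breaks inside a table cell are ignored
        if ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)  # e.g. scanning errors
    return counts, unrecognized


def codon_span(begin, end):
    """Whole codons between two inclusive positions (assumed 1-based)."""
    return (abs(end - begin) + 1) // 3


if __name__ == "__main__":
    print(summarize("MK*A/C\\X"))
    print(codon_span(100, 366))  # 267 nucleotides -> 89 codons
```

Because `/` and `\` flag possible frame shifts, the residue count of an entry need not equal `codon_span` for its coordinates; the comparison is only a sanity check.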


1 


1351 


A 


2 


337 


1 


TPSUHQAPTPCPAGLWO/PHNGHYHCiS^L 
HWPQAPHRA* * * GLLPPR WLGHGLPGG P AAP 
WAASQWVDGVAGRLPGPAWSWHASGAAPA 

OPGPL'LLVPGSSGLPDPRDP 


2 


1352 


A 


27 


100 


366 


IRNSSIRPMKERETKLSAKHMITCSASYDIKCjL 
QIETT\YHHTPrRMAKJQKT/GHHQC**ECGAT 
GTL1HGWWGCKVVEPLGKTVWQIPK 


3 


1353 


A 


40 


3 


314 1 


. HASAHASVVLKDNSELEQQLGATGAYKAKA 
LELEAEVAEMRQMLQLEHPFVNGADKLRPD 
SMYVHLNEL*QSLVENMLLTVVDTHVRTPI*R 
SCNYTLALILFL 


4 


1354 


A 


74 


2 


292 


TASALFSCPDGGSLAGFAGRRASFHLECLKX 
QKDRGGDISQKTVLPLHLVHHQVAHTFGQAT 
VTCQQARQSPG* RTNPE/ALQWVLPVSDG WH 
VLPLP 


5 


1355 


A 


78 


114 


850 


ENCRVASNLPGVFFSEDTAQSGSYMKJSAH^f 
NAGOEVSNGPKRKLTLMLNFSLPSSGLNAGA 
FYALSTLLNRMV1\VHYPGEEVNAGRIGLT1V1 
AGMLGAV1SGIWLDRSKTYKETTLVVYIMDT 
GG A W WC YTF YLGTGDTCG* CFITAGVTMGFF 
MTGYLPLGFEFAVELVSYPESEGISSGLLNISA 
QVFGIIFTISQGQ1IDNYGTKPGNIFLCVFLTLG 
A ALTAFIKADLRRQKANKETLEN 


6 


1356 


A 


81 


97 


376 


EWFSYMLGSNMSVYHSP^SLEPLCKVLSKS-A 
YLR VPFIRILLNAR* IRKAYKRMSLEIKLLI/RE 
♦CLFQEMGLSLQWLYSARGDFFRATSRL 


7 


1357 


A 


93 


2 


872 


TLSSACLIGDAWKELT1VAGAVSNQLLVWYF 
ATALADNKPVAPDRRISGHVGIIFSMSYLESK 
GLLATASEDRSVRIWKGGDLRVTGGRVQN1G 
HCFGHSARWVQVKLLENYLISAGEDCVCLV














WSHEGEILQAFRGHQGRGIRA1AAHERQAWV 
ITGGDDSG1RLWHLVGRGYRGLG/DLGSLLQ 
VP* * ARYTQGCDSGWLLATAGSD* YRGPVSL 
•Donnvr nA A ARG*TFPVLLPAGGSSWSRGL 
RIVCYGQWGRSCQGCPHQHSNCCCGPDPVS 
wrnr: a cw ft OP AWT 


8 


3358 


A 


106 


3 


350 


FSSLLSGRISTLRDETGAILIDGDPAACAPIIKr 
LLTEELHLRGVSIYVLRHEAQIYGITPLWCAL 
Ll/CRRL* SDSCMRAALNDRGLYQVLILDGLV 
OCLGFVDSDSRKMVSTLT 


9 


1359 


A 


115 


49 


186 


QAWAJFKGKYKEGDTGGPAVWKTRLRCALN 
KSSEFNEGPERERMDV 


10 


1360 


A 


123 


2 


1249 


KGCRTQEKVDRTEVIRTCINPVYSKLF1 VD* Y 

FEEVQRLRFEVHDISSNHNGLKEADFLGGME 

CTLGQIVSQRKLSKSLLKHGNTAGKSSITVIA 

EELSGNDDYVELAFNARKLDDKDFFSKSDPF 

LEIFRMNDDATQQLVHRTEVVMIWLSPAWK 

SFKVSVNSLCSGDPDRRLKCIVWDWDSNGK 

HDFIGEFTSTFKEMRGAMEGKQVQWECINPK 

YKAKKKNYXNSGTV ILNLCRlHKJvmarL.u 1 1 

MGGCQIQFTVAIDFTASNGDPRNSCSLHYIHP 

YQPNEYLKALVAVGEICQDYDSDKMFPAFGF 

GARIPPEYTDSHDFAINFNEDNPECAGIQGW 

EAYQSCF\PKAPTFTGPTNICPHSSRKVAKFRR 

SEGK*HQGRAFAIIF1LVDPGQVGVYSQDMGP 

DNPGGHFV 


11 


1361 


A 


147 


614 


9 


ACARKQLLGRTVr lwr vuv^lluvj cx-raj i orw i 

NTTSSRPASSRG\TLSSSSSSSSSLTKDALPSSL 

KSDSTTITSGLVFPFRSLCVNPAKSSVSESVSSI 

KILLSSSVKYLE*KRTSCCFPDSSESKLSQLSS 

DERVSMGTSSRKPTNSSSSLGALKMSATS\*G 

o/^ocerrrryccT Tr.l riQPPQTPPPFPfrl TTARNS 
SGSESrTPrrLl uLry or r o 1 i^lvcLrvji-. i i rvivi^o 

TTLTRDC 


12 


1362 


A 


177 


12 


416 


' LIPSEPALDSLVDPRVRSRKQPFVIYPVYUTAl 
DTKIHFSLLDGNVGEPDMSAGFCPNHKAAM 
VLFLDRVYGIEVQDFLLHLLEGGFLPDLRAA 
a ct tyt/aitt^ A\>rnPT 1 FT! C\ MMFFF IYPFI 
NIXTMNVY 


13 


1363 


A 


249 


535 


105 


" WTFHRHLSPAPLIVCDQGTCVVSYYPQNlvg 
MPDTQMEQGLN/HLFLDGNA*PHSVECYCPS 
tttct a tv ttqfvt VPHR YP APEVLLRSSVYSSPI 
DVWAVGSIMAELYMLRPLFPGTSEVDEIFKIC 
QVLGTPKKVSTLVPKLL 


14 


1364 


A 


254 


572 


201 


" " YLLTXJGNLMMLLVINADSCLRTXM*FFLGH 
rnrT mrv<5QVTAnnAAFFPVS*KPILVWGYTr 
♦SFFFIFSWGTNGCLLSAITY ACYAAICHPLLS 
TMVMNRPLCTATVNATNKMGFLNSQVN 


15 


1365 


A 


257 


425 


68 


" " THAKFLKKKFWKLVIU^^ 

ATPPT T prnnNTT\KLICEKPKNLAKNI*KRRV 
TFTPIET*HPVKQMIKWQ*LTAWLRNRGYKKI 
v OTPWQFT AP^VCRNLVFDKCG 


16 


1366 


A 


263 


104 


481 


" FCIFRTTEEDRGGDDCVVSVWTKQRNNSCVK 
SKDVFSKPVN1FWALEESVLGVKARQPKPFFA 
AGNTFEMTCKVSSKNIKSPRYSVLIMAEKPV 
GDLSSPNETKYUSLDQDSVVKLENWTDASRV 


17 


1367 


A 


298 


68 1 208 

i 


RKRTNKPIKLDKJ^EHFKNb^ 
VSSLAMKEMLTKTTM 


18 


1368 


A 


300 


904 


J 1 


LWGITGTRHHARVIFIFLVETGFPHVGQAGL 
ELLTSGDPPALASQSAGITGMSHCARPKGHFG














IHLK*MFYTMSQKMP*P riNLILLLIIPGNLNli- 
vTixTMawT r,PKTAFV*KDEVLSGIPFAKGRCR 
WK'DY'OLQEVTDPIMEKGKKKKRTASFFK 
GQPHQSTNALLRRCVR*RYHLS\TVETAGLP» 
KMTGHIPOQPFLFKLVFKC*NVICI»*QYKW*Q 
N1GVKNKSFCPH* SSSPSL'FIGHHSRNF/CSFK 
TEPH SVVQ AGGQ W KN i^SoL^ /vr r rvjuivir i^orv 

ISLMSSWDYRRPPQ 

"NSPSRWAKIQMFEHTFCG*GCG/ER/NVHlHCS 


"79 


T369 


~A 


"302 


1 


"445 


WICRLRPLLWRAVREYLSKLKNAELSFDPGV 
S LLRI Y AIDMPTS I* DEKJE ALLF AFL AFHE * HC 
KSRIW AV1Q/CML WD WLRKL* CFHRMKFYA 
AV*NKPRHLLSHIWKDVONILLK 


20 


1370 


A 


304 


1 


1339 


" FFFCGKEVPLFEQNKHPGPRATTSPU A7HAKA 
LLSAGEFTAG VGLSP* AIHSFVWLCTFIQHG A 
GGPCHQPGGSPGPWMHTTQAGHLWEGAYPG 
GSSTWHQVPGQLGGSWGPRERSLLGSFIKCSP 
CPHPPGFRLWMSPNQKPPTENPGVMGRVWR 
LMPGESPLIWEAEGKEDHLSPEGQGHSE/PVA 
PLHSSLGNTVKP*PKNQKPKQNRSRHGQ\GF 
MAGOGQSRPAAR^PPCPALTPASHSAGTWPP 
RJCRTVPGGPCPSPSGFRSCRR*GFSA* lJOWf 
DAEPPSTPDTAPRCCTQSDTSSQGPQ*S*WRR 
CRALPGRLCSAPAAGLRRARPRLSESRRGNSP 
PASPAAASARCPSWGPSCPARPPSRPAAGTEP 
AAPSRCTAWLRGEREPGPRPPGRRPRSGRGP 
VSFAPEVLSLPAVRQTKSWRWRNEEEITRPW 

A I .VP SR CrCi 


21 


1371 


A 


326 


799 


1587 


GSQ VLPPPPSQDS ATLPQD A* GPRAAFUQr' v u 

E*GLQGAGVRRLRGEVLCQPQP*GAL*EQCLP 

HLSFSPRQGAAPDTEPSAV/GPAPTGATGPULr 

LRHVRLFSAGAPRGAATPCPPALLHGPAWPP 

ARPMFRGHPPVRPLGPWGKVAAGPRALCLA 

GVPAVQGECATKPSG*GL*PAHLRGPPGPEVL 

QWHWQLSAGRDPVPAEDPPL* EGPLGPGGPA 

AAQAEPGADPEPEDKDQAAES RPAGAMSL S A 

OGSGPVGGQGLR 


22 


1372 


A 


327 


146 


652 


" PHLENPHPEHSFPGAPLT*S'TLSWSILSPRtPSP 
GAPCYPGHPHLENPHLEHLLTWRTVTWSTLL 
PGAPCYPEHPHLEHPLTWSTPHLEHPSPGEPL 
SCRTPTRSILHRDHPLP*CLSTEESPI*GWGSLP 

APPSTPLVLDVAPr Orvr A£>2>V^r\Jivi>»o^ i o v i 
rtTVVSP 

— .oooA^TnyDrm tnAMkTMKn^PTI F.KTKS 


23 


1373 


A 


348 


397 


2 


CIVSSCQGTFJvPLnLiiUANrviiNrwv^ 1 ^^^^ 
LQESL*VKQ*LIVAEKYVQ1LHPRKKYFQRPL 
NNEKRKMKKRKEEKJCXCRERMQRRSKWRR 
T-r-vi^T?*DDrrp\PFPk r ^FKFr)RK£RRKETSPRG 

SPPTTPO 


24 


1374 


A 


362 


170 


352 


GRALDTAAGSPVQTAHGLPSDALAPLDDbMf 
AirrmrTAnu/QT T-TPl^R HI ARTLLVSRVRGPQ 


25 


1375 


A 


384 


373 1 128 

i 


YLITTILETG YL WKKRHSDQ* KRTEN F tKLX^n 
KYPKVDFCKSNSMKNRLCNKNV^WTHWIFTD 

KKINLNLKPHTKLTPNIKKN 


26 


1376 


A 


397 


383 


i 165 

i 


- F vKNTNPFIFSGTNLTIWIRSI»RKSDEINgRTK 
•MEKYSISlJ^RRLNTVKMSFLPNLrYKFNTISI 

KIPANF 


27 


1377 


A 


406 


103 


380 

t — 


" KSKATG\TvtVNl*KLIVVFLYANDEQLElbMNN 
rVT\FNGSKNKJAFTNLTKYQNlQNRHAENYKl 
LVKKJEDLNK WRNVLLSWI GRRNTINTMT


28 


1378 


A 


408 


14 


427 


TICmKJ^LDEIK^LEIvHKLSKLTQEEVENL 
m VTQRFTFT VTNK*VIPHKEKPGPDSFTGEF 
YQTFKEEU1IALHKLFQTIKYGRILPNSVYETS1 
TLKPKPEKDL\KENYRPLPLSNIDAK\LNKTLA 

jNrvi JTllxv 


29 


1379 


A 


434 


395 


128 


1YSK^CMERQRLNN*ILKKNKVRGIAWDVK 
VYYKPTVIK/TSWIL*KDShTV^WNRLENLEID 
PNAKRLILDKGAEATEWRKDSFFRQWQ 


30 


1380 


A 


455 


2 


228 


FFFETESHSVl , QAGVQWCNPGFKRFSCl'ULbS 

owrrwrD \r AT>PPP\ ATSTPN^FT VFTCtFYYVAOAGL 

ICLLSPGDLPALAS 


31 


1381 


A 


462 


393 


2 


QLMFDKGVKNIHXWGW 1TPFTK* Y WKN Wlbl 
CRRMNLNPYLSRYIKINSR\KDLTVRPEP1KLV 
T-T-viT/ii/TrnnTriT r;?<r*PTATCTSK AOSTKTNK* 
KRQTR YTKLKVKKSTASKENNR VKRQPLE* EfC 
TFAN 


32 


1382 


A 


474 


125 


471 


VKP YEIA VFLVKPIE YX^HLLSDPAIPLbGJ 'i.K 
EIKAYT/RJUCTPMFAAPVSVlAAvN*KQSK/CQ 
KQ*YVHRMEYYTTIK^SEILICTTTWVDFRNT 
TLRETDRIHKTTYDV1SLI 


33 


1383 


A 


488 


1825 


2 


KSACSFICSEEQPASPSPLKPGTYASEIWKUI' 

HAAGPRRDSSEAETRRPRGA/DGSGTVVKGT 

PGSP APPCS WGHGG\ETEGAG* CP AAPGTDLR 

APGGSAGS*\GLPSAGGSRGRKGWRAAGRQP 

STR*GRPGRHGGRGE*AGHPEPRQSALQSAG 

L/ASSPEPMGAALAEDGSGDSRGAGPRPQE*P 

PSVLSRS\GS*G*G»AASGTASSPRSHSSRLGPP 

SAGFHGLRCGQPPFAAAPPGPWPGTGRPAGG 

AGSPP AAAGTAPPATRGAQ SRRQNRTAGRN A 

SPQTAAGAGSPVQWALSRATG*TGETGSWC 

AGGTHQATHLTAAWVCPPTWSVRPGGSGPA 

AGLGR*GRHPAQSPPLPVPRG*PAWPQEAPSP 

SPASSEVALSSGSCWPDQAPGPARGSPPAPLA 

P A WPAAGRCjRQK* OKi^o /vrirr r jviv oiav 

SGTS*WRRSP*AGTRTQQC*SPWLVPACSSRP 

L*RGTRRPSTQQSPQTTGTPGRSAGPGHPRS* 

GGRSPAGTGHLGAQTVASPH*GHWPTALSCL 

WASASPPGPEAPPQTGAC1GTNCRYRAASAR 

t? cqvapaPA* n WO * AGSPPAVLRGPP* RVRER 

GALTHRPRAPDE 


34 


1384 


A 


497 


422 


2 


" APGASVGRAQAAEG*RGGPTGRPPSALGVS/E 
AGRAGRAGEGRPVPPAYPLCKSAQTSGPPKA 
dt q\ppt A^rnnRGPPGGAACATCAPPAGPAR 
SSRCRRRSPPE*GPR*PSRPARPSPGSAASRRQ 
KLTPCRCQFRGLCA 


35 


1385 


A 


509 


156 


475 


- ~pf p YPGE* Q AAFLLRGPGLRPPA/DPSLK/HKJN 
LTELWAVTDENIVGLFAALLAERRVLLTAS 
KLSTLTSCDHAFCALLYPMRWEHVLIPTLPPH 
LLDYC* CPPLPRT 


36 


1386 


A 


512 


3 


1631 


' FFFSFVCHLYCVSKfPGPHGRLATWL/PGLLA 
FLGLAAGGQTLCPAGELPGHARAQASGAPGS 
VLIAVPGRRRVHTCGPGPAAPSTRGECPPPAL 
GHTRP ARPRP V\PF APA VPQEPGGQGHG AA/P 
PATGHSAPRGCPPARAAPTGSATPAP PPAACA 
AFH S A W S VPP AGRQQG * RVP AP AFRRTTPGT 
PGQHLLDRPGAPPAQGSGPAPAPPPRLAGPA 
GPAAPPPGPPAASWHSSLSKSSSSIAGWSPPLP 
VGPGSLQ*TPPPQGPHLSGSCGGTSSWRGQR 
AAVARRLRSWNACGLSRVAGRSSASYPGRE














GRPSQSQ*PAGPPGMRGCCLRGWPSSSUSU 
rpppUPAQTWl RAGKTGPSPPACGCA*LPPPS 
VSAAPQSPRTRCPRGCAAAAGLCVLAAAGAS 
HG A\GLP G VRVHTQR VHIH* GAG/ GCQTPRPR 
LRSLPNHLGLPAPRCPVSAHPWHRRSGSSCHA 
ARLVPRHP APGCP* ♦TG*\PLITGFPEP* A* GLP 
NHQAVGLEASGALQAGHRDELPTMVQLLDH 

SPDYPLKGRPHAP 


37 


1387 


A 


620 


828 


1 


FRLPLAAGA/RGAAEPRVAVSMAPDPSAKJH 

WEASPEMQSKCHQKGKNNQTECFNHVRFLQ 

r»T vTCTin v Arr.TUAPnPT rA A TD AF AFTLPTS 

FEEGKEKCPYDPARGFTGLIIDGGLYTATRYE 

FRSIPDIRRSRHPHSLRTEETPMHWLNG*EDE 

AQDDGG*GTISSFLLPWPADHPTPKSPGEPVH 

SIPVCCQVRGQPQSGGKESPACLKSLSNCLTH 

RATEKESGSFTQSRSSHRVARGIPPL 


38 


1388 


A 


739 


1 


427 


nFRAMVSSTLKLGISILNGGNAEVQ/QCiNKUKO 
TSEEGKEG * E VP V*LP V SPPLPRPLQKMLD YL 
KDKKEVGFFQSIQALMQTQGEKVMADDEFT 
QDLFRFLQLLCEGHNNDFQ>nrUvTQTGNTn' 
INniCTVDYLLRLQESI 


39 


1389 


A 


767 


1 


1030 


TLDLTGPLLLGGVPNVPKDFRGRNRQFGUCM 

R>n.SVDGKWDKlAGFL\NNGTREGCAARRN 

FCDGRRRQNGGTCVNRWNMYLCECPLRFGG 

KNCEQGEWPASSIPPVTAAWEALLLDVPGTT 

VRGLHI Q VRQPL V VY AAFTVDSHRPLQETVL 

tnt. a a n a cr>\ro<2T> c rz\rr. wtyr* A (TP A FPSPSTP 
RJ^APAPASGVrbrbtj VU WDK AurAcrjrjn 

ATVI1SVPWYLGLMFRTR\KEDSVLMEATSGG 
PTSFRLQVTGAPCHQGTC*VGARGRDPMLSG 
LRVTDGEWHHLLELKNVKEDSEMKHLVTM 
T1 nvr\/nnv<:WHi VTT I WG*TLPPAOGKTGA 
SEDKVSVRRGFRGCMQVRGGCGGRGEACPS 

s~\ a a nn t 


40 


1390 


A 


801 


69 


399 


■ i^nHKEDLNKWKYILCSGNffiRl^lVMIPVV 
PQIIYKFN A* QWILKFTW* E*G AKITILRKNKL 
RGL VL VPL STC* VKYLLDKVLPHIKTYYE AR 
VNKSWLVQVTIM 


41 


1391 


A 


835 


7 


195 


" cj/n vPTJ^VFOFPSri FFOYITWLGPPYHVLFD 
SSVTWS1GAK*D1LQSV^CLYA1CRIPCVT 


42 


1392 


A 


841 


1 


415 


GSTHASGYDKTPDFILQVPVAVEGHllHWLbb 
v a qpnnFr<;HH AYUiDOFWSYWNSLKHRTW 
QGIGTYASNLSQL*TLNAPFPELLLFRSLARTG 
FVLT*\RFGPGLVIYWYGFIQELDCNRERGILL 

KACFPTNIVTL 


43 


1393 


A 


845 


358 


92 


" "PALSPAPVPQKKGSPLPLDPCLGPSSWLLS vu 
t ru/PRi *PRUGPGDPGSLPA1TPLLTPPHTLLP 
ORPMLPPSHAGLARPPPPEPISVP 


44 


1394 


A 


853 


452 


1 


LPQYCFFPRLSPKSKXVKHSAL**PSALJvPPlK 
cppriP^T^T YFTICC/PPALOL/SPIEDPPAIYRS 
PPTHMLRSASQPLNQAPTLVKGHPPSRFLQG 
QVSCPPQPTLPREKPLPLHLRPPPRPAQPPLPR 
PLTFSTRRNVDPEIPERFR 


45 


1395 


A 


894 


379 


162 


GVYPPTVFDNYSVQTSVDGQIVSLNTWDTAU 
QEEYD/RLRTLS*PQTS1FVICFSIGNLEFPIYGT 
WLSMSMGK 


46 


1396 


A 


900 


1 


366 


" TTI<^TLlSNN\ f SSRSLPtLPELKAFSLAh NDr*L 
EIQKYMRT/DQ*CVTHDISLYTvTKLALIFLIPR 
VFLFHQLNIT* * CLHFFTMTTFt AIPF SFLFLGR
D/KSLAMLPRLVSNSWFgviLPP 


47 ] 


1397 


A 


944 


162 


2 


QLQNLASRGCL*SQLLRKLRKENRLNPCjCjUU 
HTT-PDRIAIVKKTRDSHCWRGC'EEGAPARC 


48 ZL 

49 1 


1398 
1399 


A 
A 


963 
967 


216 
466 


308 
1 


PRKRESWWGERLP/PRGFPPAAEDAPAPUWK 
GRKHASRTARAHVFHPIRQSIRSPVRGRPGDP 
RAAHTRSAGTRLQCKASRGG*GKGPAPTR*E 

nr>nPCAtJAD1 DAQQnPQl FPHSSP WTPPPP APG 

GGPGSAPAx LrAooOV^oLrrL/oor " i \j 
AAAAOP* *TPRCPAALRAGAHIGRVGRPY 


50 


1400 


A 


973 


45 


421 


EKCIQALDVFVFCYIDHSSHCLMSCD'E/U^A 
LNFMPLEMEPKMSKJLAFGCQRSSTSDDDSGC 
ALEEYAVA^PPGLRPEQIQLYFACLPEEKVPY 
VNSPGEKHRIKQLLYQLPPHDNEVRYCQSLSE 

F 


51 i 


1401 


A 


992 


2095 


194 


- iRjRHEAARSCLGCAAGHVPAPGLRLLFl VKU 
PPGRRGPAAPGCVCY* SGESTFVSHVPQRMA 
WPGSAPPRGFHPLQSQTSPSDTVSSPQLSKEE 
DGPG WEHPLSSSL* SLGQ AGGNH*QPEELAG 
WEPRGPPSLAPSSPT/TMWTALVLIWIFSLSLS 
ESHAASNT5PRNFVPNKMWKGLVKRNASVET 
VDNKTSEDVTMAAASPVTLTKGTSAAHLNS 
MEVTTEDTSRTDVSEPATSGVAADGVTS1APT 
AVASSTTAASrTTA^SSMTVASSAPTTAASST 
TVASIAPTTAASSMTAASSTPMTLALPAPTST 
STGRTPSTTATGHPSLSTALAQVPKSSALPRT 
ATLATLATRAQTVATTANTSSPMSTRPSPSKH 
MPSDTAASPVPPMRPQAQGPISQVSVDQPW 
>nTNKSTPMPSKITPEPAPTPTVVTTTKAQAR 
EPTASPVPVPHTSPIPEMhAMbr l lyi'DrMri i 
QRAAGPGTSQAPEQVETEATPGTDSTGPTPRS 
S G GTKMP ATD SCQP STQGQ YMV/DHH * APHP 
GRGRQNSPSGGAVTRGDPFHHSLGFVCPAGL 
* ELQEEGLHPGGLLNQRDVCGLRNVRG AG A 

«i mT^ a iirr»T T3DTJC1 T PT D PXIOVl PNSFGAIEEIC 

WRE A WPLPRPr LLrlJKJr IN ^ v i*r in oruAUi^i'-' 


52 


1402 


A 


994 


1 


462 


' ESGEFLVSFTLKXFnN^liHrNGMKFFNR/Lil- 
* SHTD1AFYKIQHPFMLKALTKWA*EGT*PDR 
RYLH* SLRLNGEQLKTFPLRSGMR*G/CAILPL 
VLNAMLSIVPAWPAGKTRHEKEITCPLIGQE 
FK*FS*FVGDMNTCVENKKESKKLLE 

- nrvAP a \mcu a nru/QT flTTAIFI AKfiEPPNS 


53 


1403 


A 


1011 


1 


630 


PEV1QQSA iDoisAUiWoivVji i/uiijur^vj*^* 1 

DMHPMRVLFLIPKNWPTHCWRRELESFKEV 

^LMLA^TKDPShRPTAKELLKHKFIVKNSKKT 

e^n TT?t rrwWVMJY AFnH^DDFSDSEGSDSES 

TSRENKTHPEWSFTTVRKKPDPKKVQNGAEQ 
DLVQTLSCLSMDTPAFAELKQQDENNASRNQ 

AIEELEKS1AVAEAAGPG 
- rgTT ^ A »e AFn^TOH/CFMHlLICKLGIDGlCYLN 


54 


1404 


A 


1016 


1 


222 


TIKAIDDRHTVSTILNVEKLKAFL * RSGTRQRF 
PISGSGARI 


55 


1405 


A 


1033 


3 


366 


" HAS VDGDEGSDDVYYY YTPAILRbLgALN l A 
E^HRPEEDRMLSEDPWRPAHMEKGYMPL 
HNIPHTEVIDVTGLNQSHLYQraNKGTPMKT 
OKRAA\LYTWHVLEQLEILRQINQQSHGPG 


56 


1406 


A 


1044 


5 


429 


SVLTLQTl^PSIO J LSNJli<XMDWEVVSKJN^l^t 
DRLETQSRASRSPPVTPNQSQETPVDGKPLAL 
PPNQSQKNTRYHIHYLHLQYYLDRHISATLPIP 
SSSG1PTPLA\0TDALTDLVELILX}QPCSEESGR 

APGTLFLLAL 
SEQ ID NO: of nucleotide sequence 


SEQ ID NO: of peptide sequence 


Method 


SEQ ID NO: in USSN 09/496,914 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion) 
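The symbol legend in the column heading can be applied mechanically to any entry in the table. The following is a minimal sketch, assuming the sequence has been copied out as a plain string; the function name `summarize` and the returned dictionary layout are illustrative and not part of the original disclosure. Any character outside the legend (lowercase letters, digits, punctuation) is reported separately, since in a scanned copy such characters usually indicate transcription damage rather than sequence content.

```python
from collections import Counter

# One-letter residue codes listed in the table legend (X = Unknown).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")

# Non-residue symbols defined by the legend.
FLAGS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(seq: str) -> dict:
    """Count residues, legend flags, and characters not covered by the legend."""
    seq = "".join(seq.split())  # drop spaces and line breaks
    counts = Counter(seq)
    residues = sum(n for ch, n in counts.items() if ch in RESIDUES)
    flags = {name: counts.get(sym, 0) for sym, name in FLAGS.items()}
    unexpected = sorted(ch for ch in counts if ch not in RESIDUES and ch not in FLAGS)
    return {"residues": residues, "flags": flags, "unexpected": unexpected}


if __name__ == "__main__":
    print(summarize("TIKAIDDRHTVSTILNVEKLKAFL * RSGTRQRF"))
```

A non-empty `unexpected` list is a hint that the line should be checked against the original filing before the sequence is used.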


57 


1407 


A 


1050 


11 


430 


GAY AFETNGFPIMLVLTrDKIEGD V U1AUL Y U 
\^SLPMA7LLRTLVRCTSYIIPVTHVLSTPV 
TCLRRREKDGVIVDVLSDTASNHNGFPVEEH 
ADDTOPARLQGPT1.RSQPMGPLKHKAFEERA 

NLGLVQRRLRLED 
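The predicted beginning and end nucleotide locations bound the translated region, so they give a rough expectation for the number of symbols in the amino acid column. The sketch below assumes three nucleotides per symbol and inclusive coordinates; the helper name `expected_codons` is hypothetical. Entries where the beginning location exceeds the end location are treated the same way by taking the absolute span, and entries flagged with `/` or `\` may not divide evenly because of the possible frameshift.

```python
def expected_codons(begin: int, end: int) -> tuple[int, int]:
    """Return (whole codons, leftover nucleotides) for an inclusive span.

    The order of `begin` and `end` does not matter; some table entries
    list a beginning location larger than the end location.
    """
    span = abs(end - begin) + 1
    return divmod(span, 3)


if __name__ == "__main__":
    # Entry with beginning location 11 and end location 430.
    print(expected_codons(11, 430))
    # Entry with beginning location 946 and end location 1.
    print(expected_codons(946, 1))
```

Comparing this figure with the residue count of the listed sequence is a quick way to spot entries where lines were lost in reproduction.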


58 


1408 


A 


1058 


258 


419 


T K RR DTP WG ANNRALSCTPLTS L'l LU Al^f b 
PCLGCPTXATCRLYQTTVAWF 


59 


1409 


A 


1064 


3 


425 


KAFSFTTSLIGHQRMH'rUHRPYKCKbCCjK. 1 h 
T/pocci K>moRIHTGEKPYKCNECGRAFSQC 
SSLIQHHRIHTGEKPYECTQCGKAFTSISRLSR 
HHRIHTGEKPFHCNECGKVFSYHSALIIHQRIH 

TGEKPYACKDVGK 


60 


1410 


A 


1065 


204 


419 


"GGPPGPFLAHTHAGLQAPGPLLAPAGUbUUL 
t t t Ai/nncn ADUI I TASWGGK/DPtPTKALG 
EGOEGI pl TV 


61 


1411 


A 


1079 


3 


383 


RHSRAHLCQPFHL VMRDLLQLGQDIPQUCH Y 
LEENHLIHRDIAARNCLLSCAAPTRAATIGDF 
GMARYIYRTRYYQLGDRALVLPRKWMPPEAL 
LEGIFTYNTDSWTFGVLLWEIFSLGYN1PYPGR 


62 


1412 


A 


1080 


1 


859 


VVEFLWSRRPSGSSDPRPRRPASKCQMMtbK 
ANLMHMMKLSIKVLLQSALSLGRSLDADHA 
PLOQFFWMEHCLKHGLKVKKSFIGQNKSFF 
rr \rcvi rpr a cm AT WRNLPELKTA VGR 
GRAWLYLALMQKKLADYLKVLIDNKHLLSE 
F YEPEALMMEEEGMVIVG LL V GLNVLD ANL\ 
CLKGEDLDSQVGVTDFSLYLKDVQDLDGGKE 
HERITDVLDQKNYVEELNRHLSCTVGDLQTK 
CDGLEKTNSKLQERVSAATDRICSLQEEQQQL 

PFOXTFI.TR 

PC r a vuvpTUTr T FicPFTri FCGKAFTSSl'l L i x 


63 


1413 


A 


1083 


2 


615 


HRRlHTGEKPYTCEECGKAFRQSAILyVHRRJ 

htgekpytcgecgktfrqsanlyahkkihtg 
ekpytcgix:gktfrqsanlyahkkihtg\ekp 

YKCKECGKAFKSYYSILKHKRTHTRGMSYEG 
DEC/QRSLN/RSSILSNHKIIHNEEK^LKCEKCE 
KAFNHTS ICCRHKKN 


64 


1414 


A 


1084 


946 


1 


' " KKQDLSSSLTDDSKNAQAPLALTESHLA 1 LA 
SSSQSPEAJKQLLDSGLPSLLVRSLASFCFSHIS 
SSESIAQSIDISQDKLRRHHVPQQCNKMPrTAD 
i VAPn I? FT TFVGNSfflMKDWLGGSEVNPLW 
TALLFLLCHSGSTSGSVHNLGVAQQDQCKISFS 
Fr^WLTTGLTTQQRTAIE\NATVAFFVLQCl\SC 
HPNNOKLMAOVLCELFQTSPQRGNLPTSGNI 
S\GFIR\RLFLQLMLEDEKVTMFLQSPCPLYKG 
RINATSHV1QHP\MYGAGH1CFRTLHLPVSTTL 
SD\^DRVSDTPS1TAKL1SK0KDDKKKK 


65 


1415 


A 


1087 


103 


324 


" PRAFEFVHTEMIVG/RVQNIHLFTLQVLEUKA 
LrTMSVGSSLWSTYLIHVMALP/DRELLKPNA 

SVALHKLSNALV _ 


66 


1416 


A 


1095 


3 


493 


" HETCSVTHTVSFSLPFLNPSHPASTPGH i bNEQ 
PSLVWFDRGKFYLTFEGSSRGPSPLTMGAQD 
TLP V AAAFTETVN A YTKG ADPSKCI VKXTGE 
MVLSFPAGITRHFANW > SPAALTrKV IN rbKbt 
HVLrWQLLCCDNTQNDANTK x £FWVNMPNL 


67 


1417 


A 


1098 


57 


356 


" LKLTSLGFHGVSVVGNLLlSILLVKDKibHKA 
PYYFLLDLCCSDILRSAICFPFVFNSVKNGST 
VyTYGTLTCKVIAFLGVLSCFHTAFMLFCISVT 
RYL 


68 


1418 


A 


1106 


1 


1326 


MGKISATGnsfMGTKCSWALVWHLESYDPKH 

YEREGMQDWKTASGQSEEATQQSSQKPQPH 

YTTYQSSSFLKYSSESHLLAWRENSSEGSFQF 

PGRSRARPPRTRQQRRGAAAGPGRGAVRLG 

HPQSAAQPQLRAAARIPESPAAFPAQPRPGSA 

RN SDASGPASL SRTLGRASSPRPPQAPDVTAP 

SPAALAPRAARGGSRAAAEAGAEAEEPLRTL 

APRPTRAAAPPPPPPPPPLPPGAPPPPVRCVSR 

RARAPPWR/PAATGPPPVRPVAPSRKLGSARAP 

APALQIRKGTSSGLPGRGGGSGPGNNLS S V A 

GN WRG S SF A VERPGMAK YQGE VQSLKLDDD 

SVIEGVSDQVLVAWVSFAL1ATLVYALFRNV 

HQNIHPENQELVRVLREQLQTEQDAPAATRQ 

QFYTDMYCPICLHQASFPVETNCGHLFCGSLT 

PNSIW 


69 


1419 


A 


1107 


2 


466 


FDTARLHEFGTSITQIFAVDNREDLQKWN4HA 

FWQHFFDLSQWKHCCEELN1KIEIMSPRKPPLF 

LTKEATSVYHDMSIDSPMKLESLTDIIQKjCIEE 

TNGQFLIGQREESLP/SS/CGPHSLMVTIKWSS 

RKRY/SYPASEPLHDEKGKKRQAPLPPSDK 


70 


1420 


A 


1111 


698 


23 


ALRRLHYVRATKVXFLSFRRPFWREEHIECiGH 

SNTDRPSRMIFYPPPREGALLLASYTWSDAAA 

AFAGLSREEALRLALDDVAALHGPWRQLW 

DGTGWKRWAEDQHSQGGFWQPPALWQT 

EKDDWTVPYGRIYFAGEHTAYPHGWVETAV 

KSALRAAHONSRKGPASDTASPEGHASDMEG 

QGHVHGVASSPSHDLAKEEGSHPPVQGQLSL 

QNTTHTRTSH 


71 


1421 


A 


1119 


2 


385 


QKQTLQNGYLDSSMDE-YLGSLPPELQVSSDE 
PPGPPEQAGLSQFHLEPETQNPETTEEIQSSVLQ 
QEAAAQLPQLPEWELSSTKA\EAPALPSQSL 
EGVHSSTEQKAPAQQLPAFEEILAPLLIHHE 


72 


1422 


A 


1127 


1 


906 


HAQYVGPYRLEKTLGKGQTGLVKLGVHCIT 

GQKVAIKJVNREKLSESVLMKVEREIAIL\RLI 

EHPHVLKLHGVYENKKYFPPDELTSGPSMLA 

QVSPHGKLSARRSWDLLSGFPRYLVLEHVSG 

GELFDYLVKKGRLTPKEARKFFRQIVSALDFC 

HSYSICHRDLIO^ENLLLDEKNNIRIADFGMAS 

LQVGDSLLETSCGSPHYACPEVIKGEKYDGR 

RADMWSCGVILFALLVGALPFDDDNLRQLLE 

KVKRGVFHMPHFIPPDCQSLLRGMIEVEPEKR 

LSLEQIQKHPWYLGGNFIS 


73 


1423 


A 


1128 


1 


802 


LRNALDVLHREVPRVLVNLVDFLNFTIMRQV 

FLGNPDKCPVQQA/MLEPLGSKTETLDLRAE 

MPn^CPTQNEPFLRTPRNSNYTYPIKPAIENWG 

SDFLCTEWKASNSVPTSVHQLRPADIKWAA 

LGDSLTTAVGARPNNSSDLPTSWRGLSWSIG 

GDGNLETHTTLPNILKKFNPYLLGFSTSTWEG 

TAGLNVAAEGARARDMPAQAWDLVERMKN 

SPDINLEKDWKL VTLFIGG NDLLH Y CbN rbA 

HLATEYVQHIQQALDILSE 


74 


1424 


A 


1139 


60 


ASIC) 


FREPCLLVPGDHQPLREASWLA/LPPIGLWGT 
DSPLCCVEVAIPCNKGAHSVGLKGWLLAQG \ 
VLGMRDTIPQEHPWESTPDLCFCRDPEEIEVE 
EQPAADAAVAKGEF/QGEQIAPVPAUIAAHPE 
AADPAPVHTTAHPKGA 


75 


1425 


A 


1147 


2 


413 


" PFPHQHPQEP\KGSCWPQSALRGQCPGPVLGV 
TTTSDLCSLQVPVSSHRNPLLDLAAYDQEGR 
RFDNFSSLSIQWESTRPVLASIEPELPMQLVSQ 
nnFC.r,nKVn HOI OAILVHEASGTTAITATAT 
GYQESHLSSAR 


76 


1426 


A 


1155 


38 


410 


pTTQAPAnnnpn t ^FTHPT HANLLCVWRRDVK 

PDCK£IWIFWWGDEPNLV\VQYIMNCMLWK 

KDSGKMAFPMNVGRCVFFKE1HNLLERCLMD 

KNFVLIGKWFVRPYYKDEKPVNKSEHLSCAF 

T 


77 


1427 


A 


1162 


526 


350 


RFPQGLEDVSTYPVLIEELLSRGWSEEELQGV 
LRGNLLRVFRQVEKVQEENKWQSPLED 


78 


1428 


A 


1171 


1 


1293 


MAESASPPSSSAAAPAAEPGVTTEQPGPRbPP 

SSPPGLEEPLDGADPHVPHPDLAPIAFFCLRQT 

TSPRNWCIKMVCNPWEECVSMLVILLNCVTL 

GMYQPCDDMDCLSDRCKILQVFDDFIFIFFA 

MEMVLKMVALGIFGKKCYLGDTWNRLDFFI 

VMAGMVEYSLDLQNINLSAIRTVRVLRPLKA 

INRVPSNmLVNLLXDTLPMLGNVLLLCFFVF 

FIFGlIGVQLWAGLLRNRCFLlibrsr 1 i^^uval 

PP\YYQPEEDDEMPF1CSLSGDNGIMGCHEIPP 

LKE QGRECCLSKDD VYDFGAERQDLNASGL 

CVNWNRYYNVCRTGSANPHKGAINFDNIGY 

AWIVIFQVITL£GWVEIMYYVMDAHSFYNFI 

YFlLLIIVSVRLPGLLCjO^r a 1 Av^orisx^vji-;orr 

GVAAESLLLRGWVLWLPGGG 

— ™ ir^nnrni jT^r>T~\T ~n/^/^D7 nVTV A PX/ri Vf?A Yl M 


79 


1429 


A 


1175 


1 


405 


PNDFFKJJMFPIJLFOuri^oriK/viirHU i i L/i^i 
FLSATHLGGLFPPWPLVEERKLKPKASQQCPI 
CHKVIMGAGKLPRHMRTHTGEKPYMCTICE 
VRFTRQDKLKJHMRKHTGERPYLCIHCNAKF 

V HN Y V L K.N MMK 


80 


1430 


A 


1182 


25 


198 


EMNELSQQLSQQGGRGASQCPSPPAPTLPNK1 
PLCQLQLQRVNTGLPTPPCHPGAGAA 


81 


1431 


A 


1186 


254 


583 


"KTVLDVGAGTGELSIFCAQAGARRVYAVEAS 
AIWQQAREVVRFNGLEDRVHVLPGPVETVEL 
PEQVDAIVSEWMGYGLLHESMLSSVLHARTK 
VVKDGGFFLPXSSELFM 


82 


1432 


A 


1187 


2 


716 


DFVDAARNLPLESTKSPAEPSKSVPSLEuDPKA 

SSFEKSDSLEQPSGLEGEDKPLAQFPSPPPAPH 
GRSAHSLQPKLVRQPNIQVPEILVTEEPDRPD 
TEPEPPPKEPEKTEEFQWPQGSQTLAQFPVEK 
t dpwv t?t r,i AKMAOSSGESSFESSVPLFRSP 
SQESNVSLSGSSRSALFERDDHGKAEAPSPSF 
DMGPKPLGTHMLTV 


83 


1433 


A 


1188 


517 


804 


ESPGLSKVLRTGAFAYPFLFDNLPLFYRLGLC 
WGRGHGCGQEALSTSHGYHLFCALLTGFLFA 
cur PFPT APTtRFDYIGHSHOLFHICAVLGTHF 

0 


84 


1434 


A 


1192 


45 


476 


LGDVGFWVERTPVHEAAQRGESLQLQQLLbi> 
GACVNQVTVDSITPLHAASLQGQARCVQLLL 
a Ar.AnvnAR>JTDGSTPLCECLRLGOHRVCEA 
LAVLRGQGQPSPVHSVPPARGLHXREFRMC* 
GFLFDVGXNLEAHEFHFGEP 


85 


1435 


A 


1194 


69 


410 


KRSEEASAPPFPLGGTGAAPTRASLPEQILLPK 
SCLEARXSQPDEKLLSALHNSRTWN^EPRRSQ 
HRLVSPEVHPGRRGSSPGVAECKLTSAYFRT 
GRSPCPSLPGTTRTNSLL 


86 


1436 


A 


1215 


3 


405 



LPSHTCGNPGRLPNGIQQGSTFNLGDKVRY 

NLGFFLEGHAVLTCHAGSENSATWDFPLPSC 

RADDACGGTLRG/AEWHHLQPPLPLG/ATKN 
N ADCT WTILAELGDT1 AL VF 1DFQLEDG YDFL 
EVTGTEGSSLW 


87 


1437 


A 


1216 


226 


964 


GTARFGPMVGFGANRRAGRLPSLVLGVLLV 

\rr\r\n a ttmvw^T^^RHVI 7 OFFVAELOGOVO 
VI V V LATIN I VVOlooivn YJuivv*-' 1 -' rru/^vviy ▼ V 

RTEVARGRLEKRNSDLFAWGHAQETDRPEG 

GRLRPPQQPAAGQRGPREEMM£DDKVKLQNN 

ISYQMADIHHLKEQLAELRQEFLRQEDQLQD 

YRKNNTYLVKRLEYESFQCGQQN1KELRAQH 

EENIKKLADQFLEEQKQETQKIQSNDGKELDI 

xTxi/-^\/\rDif r, KrTPfc r v a FMV A nkTNFFPSSNHIPHG 


88 


1438 


A 


1218 


1 


534 


PEFGTTISCGYLMATDVSRRPSVHKAVEIEQE 
RVKSAGAWIIHPYSDFRFYWDLIMLLLMVGN 
LIVLPVGITFFKEENSPNPWIVFNVLSDTFFLLD 
LVLNFRTGIWEEGAEILLAPRAIRTRYLRTW 
FLVDLISS1PVDYIFLVVELEPRLDAEVYKTAR 
ALRIVRFTKILSLLRL 


89 


1439 


A 


1223 


1 


743 


MGFDEVFMINLRRRQDRRERMLRALQAQEIE 
CRLVEA VDGK VGMLTRbN AAr UKiILAJVile, i 
LVWAPRFVDADNLILNPDTLSLLIAENKTVV 
APMLD SRAAYSNFWCGMTSQG YYKRTPAYI 
PIRKRDRRGCFAVPMVHSTFLIDLRKAASRNL 
VAFYPPHPDYTWSrujJn vrAr oi^K^vvr- v i^m i 
VCNKJEEYGFLPVPLRAHSTLQDEAESFMHVQ 
LEVMVPSSPSSAQSMAWSADHIGLVISYL 


90 


1440 


A 


1227 


2 


349 


NKTSFlFYLKNJWADUMTLTFPFRIVHDAGh 
GPWDFKF1LCRYTSVLFYANMDTSIWLGLIT/ 
YD R Y/ WKVVRHL/WD S WMTGU S FTR VYLLG 
LGARLVWFGKXILAKGGHGGISWL 


91 


1441 


A 


1245 


3 


1937 


LGS S D VRAPQRSELG AESPSRMV ASQ A YNLT 

SALTPILTRSRVLNEEPLTLAGFNSRAPANLSD 

WQLIFLVDSNPFPFGY1SNYTVSTKVASMAF 

QTQAGAQIPIERLASERAITVKVPNNSDWAAR 

GHRSSANSVWQPQAFVGAWTLDSSNPAAV 

LHLQLNYTLLDGRYLSEEPEPYLAVYLHSEPR 

PNEHNCSASRR1RPESLQGADHRPYTFFISPGT 

RDPVGSYRLNLSSHFRWSALEVSVGLYTSLC 

QYFSEEDWWRTEGLLPLEETSPRQAVCLTR 

HLTAFGTSLFVPF^HIRFVEPEPTADVNYIVML 

TCAV(XVTYMvTvlAAILHKLDQLDASRGRAIP 

FCGQRGRFKYEILVKTGWGRGSGTTAHVGIM 

LYGVDSRSGHRHLDGDRAFHRNSLDIFQIATP 

HSLGSMWK1RVWHDNKGLSPAWFLQHIIVRD 

AASKASFRVPTPSVAALLRFRRLLVAELQRGF 

FDKJT1WLSIWDRPPRSCFTRIQRATCCVLLICL 

FLGANAVWYGAVGDSAYSTGRVSRLNPLSV 

DTVAVGLVSSVWYPVYLAILFLFRMSRSKV 

GWGWGPGSTGNGAWASAPCPEPPLSSAAAR 


92 


1442 


A 


1246 


5 


562 



WDEENlLNELNDPLREEIVNFNCRKLVATNlt 1 

LFANADPNFVTAMLSKLRFEVFQPGDYIIREG 

AVGKKMYFIQHGVAGVITKSSKEMKLTDGS 

YFGEICLLTKGRRTASVRADTYCRLYSLSVD 

NFhTVLEEYPMMRRAFETVAIDRLDRlGKKN 

SILLQKFQKDLNTGVr^^QENEILKQIVKH 


93 


1443 


A 


1249 


180 


901 


" TVPPPPGGPSPAPLHPKJvSPTSTGEAELfUthJ<X 
PGRKASCSTAGSGSRGLPP\SSPMVSSAHNPN 
KAEIPERR}a)STSTTNNLPPSMMTRRNTYVCT 
ERPGAERPSLLPNGKENSSGTPRVPPASPSSHS 
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"LAPPS GERSRLARGSTIRSTFHGGQVRDRRACj 
GGGGGGVQNGPPASPTLAHEAAPLPAGRPRP 

ROEDHLSPGGRGCSEL 


94 


1444 


A 


1261 


3 


385 


"KFSQWGLTKPKLSNASP/WISLVKXLMKKWS 
VTQNLTFREQLEAGIRYFDLRVSSKPGDADQ 
EIYFIHGLFGIKV^TXjLMEIDSFLTQHPQEIIFL 
DFNHFYAMDETHHKCLVLR1QEAFGNKLCPA 

CR 


95 


1445 


A 


1282 


2 


550 


GPRDNPG\EDPRFEIVEHFG1AWFTFELVARFA 
V APDFLKFFKN ALNLIDLMS1 VPFYITLV VNL 
VVESTPTLANLGRVAQVLRLMRIFRILKLARH 
STGLRSLGATLKYSYKEVGLLLLYLSVGISIFS 
VV A YTIEKEENXEGL ATIP ACW W WATVS MTT 
VGYGDWPGTTAGKLTASACILA 


96 


1446 


A 


1294 


1 


1456 


" QLLPPSNRENAGLLVGRCLCSAALRPVGDLIT 
SSGQVAVRNAPQAGSAKAGKGKFQDNFEFIQ 
YFKKFFDANCNEKDYNPVAAGQGQETEVAP 
SIVAPVLNKPNQCPEGYICVKAGRNPNYGYT 
SFDTFSWAFLSLFRLMTQDYWENLYQLTLRA 
AETTYMTF/LV/LVILLGSLYLVTLILAV/VAMA 
YEEQNQATLEEAEQKEAEFQQMLEQLKKQQ 
EAAQQAATATASEHSREPSAAGRLSDSSSEAS 
KXSSKSAKERRNRRKKRKQKEQSGGEEKDED 
EFQKSESEDS IRRKGFRFSIEGNRLTY EKJ< Y b b 
PHQSLLSIRGSLFSPRRNSRTSLFSFRGRAKDV 
GSENDFADDEHSTFEDNESRRDSLFVPRRHGE 
RRNSNLSQTSRSSRMLAVFPANG1CMHSTVDC 
NGWSLVGGPSVPTSPVGQLLPEVnDKPATD 
DNGTTTETEMRKRRSSSFHVSMDFLEDPSQR 
0RAMS1ASILTNTVE 


97 


1447 


A 


1295 


2 


2057 


IQTQLPTKSSQQLRKGGNCVRCKMQMNFIAE 

EVLLKYRTTFyNNNKGPNh^YIEIKAFVHFMl 

NRYLSYGSGPKRFPLVDVLQYALEFASSKPV 

CTSFVDDIDASSPPSGSIPSQTLPSTTEQQGALS 

SELPSTSPSSVAA1SSRSVIHKPFTQSRIPPDLP 

MHPAPRHITEEELSVLESCLHRWRTE1ENDTR 

DLQESISRIHRTIELMYSDKSMIQVPYRLHAV 

LVHEGQANAGHYWAYIFDHRESRWMKYNDI 

AVT1CSSWEELVRDSFGGYRNASAYCLMYIN 

DKAQFLIQE\DLIKTGQPLVG1ETLPPDLRDFV 

EEDNQRFEKELEEWDAQLAQKALQEKLLAS 

QKLRESETSVTTAQAAGDPKYLEQPSRSDFSK 

HLKEET1QIITKASHEHEDKSPETVLQSAIKLE 

YARLVKLAQEDTPPETDYRLHHVWYFIQNQ 

APKKJIEKTLLEQFGDRNLSFDERCHN1MKVA 

QAJCLEMII^fcfc VNLtfc ittw riKiu i ts^r tsx. i i 
MYLnGLENFQRESYlDSLLFLICAYQNNKELL 
SKGLYRGHDEELISHYRRECLLKLNEQAAELF 
ESGEDREVNNGLIIMNEFIVPFLPLLLVDEMEE 

v-T^n a "\7"C7™iX>TD"MP WPSYl HOFMFPHT .OEKLT 
JCDILA VJbUlVlKJNi\VVUo I ljLJv/n.ivicr iu^^uAJ-i 

DFLPKLLDCSMEIKSFHEPPKLPSYSTHELCER 
FAR1MLSLSRTPADGR 


98 


1448 


A 


1304 


118 


453 


SGPSSRAIYLHRKEYSQNLTSEPTLLQHRVbH 
LMTCKQGSQRVQGPEDALQKLFEMDAHGRV 
WSQDLILQVRDGWLQLLDIETKEELDSYRLD 
S1QAMNVALNTCSYNSILS \ 


99 


1449 


A 


1306 


3 


1660 


" CGYFCHTTCAPQAPPCPVPPDLLR1 ALGVHPE 
TGTGT4YEGFLSVPRPSGVRRGWQRVFAALS 



1450 



A 1318 



101 1451 A 



102 



1452 



103 



104 



1453 



1454 



105 



106 



107 



108 



1455 



1456 A 1383 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



918 



1353 



220 



1363 



1371 



1376 



542 



1379 



1457 



1458 



1386 



1397 



190 



445 



410 



432 



396 



432 




dsrLllfdapdlrlsppsgallqvlulrdpqf 

SATPVLASDVIHAQSRDLPRIFRVTTSQLAVPP 

ttctvlllaesegererwlqvlgelqrllld 
arprprpvytlkeaydnglpllphtlcaaild 

QDRLALGTEEGLFVIHLRSND1FQVGECRRVQ 

QLTLSPSAGLLWLCGRGPSVRLFALAELENI 

EWEVPKIPESRGCQVLAAGSILQARTPVLCVA 

VKRQVLCYQLGPGPGPWQRRIRELQAPATVQ 

SLGLLGDRLCVGAAGGFALYPLLNEAAPLAL 

GAGLVPEELPPSRGGLGEALGAVELSLSEFLL 

LFTTAGIYVDGAGRKSRGHELLWPAAPMGW 

GY AAPYLTVFSENSIDVFDVRRAEWVQTVPL 

KJCWRPLNPEGSLFLYGTEKVRLTYLRNQLAE 

KDEFDIPDLTDNSRRQLFRTKSKRRFFFRVSE 

EQQKQQRREMLKDPFVRSKLISPPTNFNHLV 

HVGPANGRPGARDKSP 

SLCVPGPVDTGTFAVMSVMVGSVTESLAPQA 
LNDSMINETARDAARVQVASTLSVLVGLFQV 
GLGLIHFGFWTYLSEPLVRGYTTAAAVQVF 
VSQLKYVFGLHLSSHSGPLSLIYTVLEVCWKL 
PQSKVGTWTAAVAGWLVWKLLNDKLQQ 
QLPMPIPGELLTLIGATGISYGMGLKHRFEAG\ 
PPVAPNTQLFSKLVGSAFTIAWGFAIAISLGK 
IFALRHGYRVDSNQVWVMRDV 



DWPDLFTYPLIGSPKCFQSARPENRMYRR'lVR 
SSHGNHALQEVLPRSGHGTEFTKQKHLEAAD 
HOHPPARMSIFSR 



AHLLMLNL AL\TDLL\YLTSLPFL1H V YASGEN 
WIFGDFMCKFIRFSFHFNLYSSILFLTCFSrFRY 
CVIIHPMSCFSIHKTRCAVVACAWWIISLVA 
VIPMTFLITSTNRTNRSACLX)LTSSDELNTIKW 
YKLILTANLLCLPLVTVTLCYTTIIHTLTHGHAN 
\DSCLKQKARRLTILLL 



CHSTESSSDFILPGDYLLGGLCPLHSGCLQV\C 
SFNEHGYHLFQAMRLAVEEINNSTALLPNITL 
G YQLYDVCSDSANVY ATLRVLSLPGQHHIEL 
QGDLLHYSPTVLAVIGPDSTNRAATTAALLSP 

FLVPMLLEQ 



NSRVEDRS/NMSLWTQNITVCPVRNVTRDGG 
FGPWSPWQPCEHLDGDNSGSCLCRARSCDSP 
RPRCGGLDCLGPAIHIANCSRNGAWTPWSSW 
ALCSTSCGIGFQVRQRSCSNPAPRHGGRICVG 
KSREERFCNENTPCPVPIF 



GLGLLYLIFAAVEGVMRV1GGSNHLAVVLDD 
ID^VIDSIFVWFIFISIJVQTMKTLRLRKNTVKF 
SLYRHFKNTLIFAVLASIVFMGWTTKTFRIAK 
CQSDWMERWVDDAFWSFLFNSLILIVIMFLW 



RPSA 



719 



631 



558 
1 



EDGHGGWSSRCLVDHAEEGHREPWKRLCIW 
QRGGHEIRFAFYFPGHPLLSPQICLAPETPPRG 
CPPVSSLHF1SLQ/RLPRDCQELFQVGERQSGL 
FEIQPQGSPPFLVNCKMTSGTFWTCRTDSRVF 
QNANPSNAAHSEDQPTP 



FFFVTRSHSVAQAECSGVFTAHRSLDLVG5>SN 
YPALSLQSSWDHRHTWL1FAFL 



"RVAISLLCAAIFISFMVQSAGKRWPTGVMLM 
W VLF AFL YS WPI Q ALLPTYLKTDLA YNPHT 
VANVLSFSGFGAAVGCCV/GGFLGDWLGTRX 
AYVCSLLASQLL1IPVFA1GQANVWVLGLLLF 
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FQQMLGQGlACHLr'KLlLrfj y ru i jj^a/^vji^j 
FTYNVGALGGALAPIIGALIAQRLDLGTALAS 
LSFSLTFWILRNRRPGKSLVR 


109 


1459 


A 


1402 


15 


387 


VLVALPDTWTSETWTEVLGHRVTLPCLYSb 
WSHNSNSMCWGKDQCPYStjClUiALiK 1 l»um 
RVTSRKSAKYRLQGTEPRGDVSLTTLNPSESDS 
GVYCCRIEWGWFNDVKINVRLNLQRASTT 


110 


1460 


A 


1421 


3 


350 


"HEDLSSLLTRGSGNQERERQLKKL1SLRDWM 
L AELAFPVG VL ATCA* SLLSC* YCVILFPCbCh 
FFHSPD ALFSLLLLSC YEPS YCFF YYLFFSSSPL 
CLLLASSPFPLFILLASL 


111 


1461 


A 


1426 


2 


344 


_ FTSTMTKPFEKESEQPA*ATLAFGAQT5>rrAL) 
QCALKPDLSYLNNSSSSSSTPATSAGGGIFGSS 
TSSSNPPVATFVFGQSSDPVSSYOFVNTAESST 
SDSLLFSQDSKLATTS 


112 


1462 


A 


1434 


46 


372 


" TTSWTTSCTRSCT*SGASSGPGW1VK1 1 wwk 
SRRSSQRTCSRACSGAWSRTW*RSS*TSSSSC 
STSCSSSSSRSCGRPGGPLGARGVHITSCLNSC 
MSSSTTSSTTSTF 


113 


1463 


A 


1439 


3 


292 


~ HEDIMTHYDRLVDE* ALNAGKQRYEKMiSO 
M YLGEI VRN1LIDFTKKGFLLRGQISEMLKTR 
G1FLTFLLSNFIJVCVLLFYVSFYLFQSCINFVL 


114 


1464 


A 


1463 


i "i 


396 


KQQAVPEPHSSTTTPQEQEQNWYGQDLLNLg 
QRTKVHLPGHKTGPAVAKDTPEPVKKEFTVP 
ATSQGP*SPFSEEPPLPPSNEEVPPTLPP*EPQS 
EDP*KNA*LICQMHAATTHWQQHQQHQVGC 

OYHGIMQ 


115 


1465 


A 


1464 


291 


2 


AGSYPSMVWSCHWGVTQKRRAL'VYSFEEG 
GRRKCGQYWPLEKDSRIRFGFLTVSNLGVEN 
MNHYKKSTLEILNPEVNPGFFFLTLWKQGEN 

>ivrM 


116 


1466 


A 


1465 


667 


337 


LPPQRPA'TDSYSTCNVSSGFLAGQSHNIHLQ 
YWTKYQVWEWLQHFLDTNQLDANCIPFQEF 
DINGEl^CSMSLQEFTRAAGTAGQLLYSNLQ 
HLKWNGDSLFLCLSLPC 


117 


1467 


A 


1479 


1 


381 


GTSGGPKRVLVTERFPWQNPLPVNRGQA^K 
VLGPSNSFQRVPLQAQKLVSSHKPGQNQKHK 
QLQATSVPHPVCMPLNNTQKSKQPLPSAPEN 
NPEEELASDPNNEESL*RPWALEDFEIGl<PLo 


118 


1468 


A 


1485 


3 


385 


' "^TYLWL*GNPPFYEKNDGGLFELLLKAKlJth N a 
PYVTODMSDSAKHFIRPLTGRDP*KPFPCDQPL 
QHPWreGHTCLDNNTHQAASEPINNNFAESKR 
NLAFLATGVVRHMRXLrMGANLbGh'ur i vz> 


119 


1469 


A 


1486 


1 


398 


' GTTSKHH 4 LARSL1RGPFDHDLKPNAA 1 KDQL 
NIIVSYPPTKQLTYEEQDLGWKFRYYLTNQE 
KALTKFLKWVNWDLPQEAKQALELLGKWK 
PMDVKDSLELLSSHYTNPTVRRYAVARLRQA 

DDEDLLMYL 


120 


1470 


A 


1497 


3 


999 


" MGESPAV'GYFVLAGMNSAGLSFGGCiAURY 
LAEWMVHGYPSENVWELDLKRFGALQSSRT 
FLRHRVMEVMPLNm)LKVTHWDFQTGRQL 
RTSPLTORLDAQGARWMEKHGFERPKYFVP 
PDKDLLALEQSKTFYKPDWFDIVESEVKCCK 
EAVC\aDMSSrTEFEITSTGDQALEVLQYLFS 
NDLDVPVGH1VHTGMLNEGGGYENDCS1ARL 
NKRSFFMI SPTDQQ VHCWA WLKKHMPKDSN 
LLLEDVTWKYTALNLIGPRAVDVLSELSYAP 
"MTPDHFPSLFCKEMSVGYANGIRVMSM'rJil 1 
GEPGFMLYIPffiYRWGFTMLSTLVSNS 
a apt t \f/^\i7TMJTT *r tvt *TTJT TFT GRTTCDON 


121 


1471 


A 


1498 


3 


306 


AQFLLVGWDniL L1VL l in i cj^vjiv i i ^^v i 1 
WPNSPDVLNHGCFYMQCLSKDCT1GYVSRE 
ML V AHTHTVEEHTGTHLQ YV S WPDHS VFDD 
SSDFVEFEN 


122 


1472 


A 


1533 


121 


329 


LGLFSFVWrbVLbbJ J KL»r2>L.bi bL>ris.i.un^ i 
WDPGTDTALGWSKQPSQSYTLFES*VGSGYII 

DNFFLA 


123 


1473 


A 


1547 


111 


408 


~DARTTWKPRNGSSGIWPGDGAK*FPAVEQAb 
RGHVEMIEKLTFLNLHTSEKDKGGNTALHLA 
AKHGHSPAVQVLLAQWQDINEMNEKQQTPL 

HVAADRG 


124 


1474 


A 


1555 


1 


745 


MTFDDDDKNTYGVALVWKKFQTQSLRLSDL 

HRKSHLWRGIVSITLIEGRDLKAMDSNGLSDP 

YVKFRJLGHQKYKSKIMPKTLNPQWREQFDF 

HLYEERGGV1DITAWDKDAGKRDDFIGRCQV 

DLSALSREQTHKLELQLEEGEGHLVLLVTLT 

ASATVSISDLSVNSLEDQKEREEILKRYSPLRI 

FHNLKDVGFLQVKVIRAEGLMAADVTGKSD 

PFCVVELNNDRLLTHTVYKNLNPEWNKVFTL 

♦VALVWKKFQTQSLRLSDLHRKSHLWRGIVS 

ITLIEGRDLKAMDSNGLSDPYVJO^KLGHQKY 

KSKIMPKTLNPQWREQFDFHLYEERGGVIDIT 

AWDKDAGKRDDFIGRCQVDLSALSREQTHK 

LELQLEEGEGHLVLLVTLTASATVSISDLSVN 

SLEDQKEREEILKRYSPLRIFHNLKDVGFLQV 

KVIRAEGLMAAD VTGKSDPr CV V bLN nlikll 

THTVYKNLNPEWNKVFTL 


125 


1475 


A 


1556 


57 


509 


GGPAPNSRYAEP'KNbLAMl'AtiAiX-tNT ya 

CGGLDNICSIYNLKTREGNVRVSRELPGHTGY 

LSCCRFLDDSQIVTSSGDTTCALWDIETAQQT 

TTFTGHSGDVMSLSLSPDMRTFVSGACDASS 

KLWDIRDGMCRQSFTGHVSDINAVS 


126 


1476 


A 


1592 


3 


178 


"KSEKSCVSSLAHFGTSCQRDYDAMVKLVii-l L 
EMLPTCDL ADQHN lKr niAr al in k cj\ 


127 


1477 


A 


1612 


1 


497 


' ' TESPLL VRPYLPYITKSELHAIMTAGFS'J lAUb 
VLGAY1SFGVPSSHLLTASVMSAPASLAAAKL 

—.-^.p./Tvy iTr i/vt a k /TV x /TC C/^nQflMT 1 ♦ A AT 
FWPETEKPKITLK.NAMJsMfcMjiJoW»ii#i- 

QGASSSISLVAN1AVNLIAFLALLSFMNSALA 

VA^GNMFDYPQLSFELICSYIFMPFSFMMGVE 

WPDSFM 


128 


1478 


A 


1619 


286 


486 


" ' ccmnskaqesvfknvlcnppalsempdvka 

EDE VDFRA b b 1 Sbb V A V u o iaa i LrisjvirwV^J* 1 ™ 
TQArKR 


129 


1479 


A 


1627 


1 


395 


" PTRGALRY WIFGRFLCNIWAAVDVRCCI A 1 1 
MGLCII SIDR YVG VS YPLR YPT1VTQRRGLMA 
LLCVWALSLV1Y1GPLLGWRHPAPEDETICQI 
NEEPG YVLr o 1 rOor Y LrL/UiVLU v ivjj.^ xv v i t 
AKTE 


130 


1480 


A 


1638 


2 


466 


" DPRVRTKIWRKTTIYEIQDKTGSMAVVGKU 
ECHNIPCEKGDKLRLFCFRLRKRENMSKLMS 
EMHSFIQIQKNTNQRSHDSRSMALPQEQSQHP 
KPSEASTTLPESHLKTPQMPPTTPSSS SFTTC VT 
KDKDIK* LLFNL YSS VEILPEVLHLKT 


131 


1481 


A 


1651 


607 


3 



L AEGGD VFDC VLNGGPLPESRAKALFRQM V b 
AIRYCHGCGVAHRDLKCENALLQGFNLKLTD 
FGFAKVLPKSHRELSQTFCGSTAYAAPEVLQ 
GIPHDSKKGDVWSMGWLYVMLCASLPFDD 
"TDIPKML WQQQKGVSFPTHLSISADCQDLLK 
RLLEPDMILRPSIEEVS WHPWLAST* * KQWQ V 
LSNKVGGESKPKKKK 


132 


1482 


A 


1656 


150 


48 


" LVAKSLLYCGCLFFLLQLAKNVGNNSFND1M 
fa>jt TSPSPKPTPSSDM*VFLrY*TYFGAWHY 
VDAQ 


133 


1483 


A 


1660 


3 


406 


RKHIKLLIQKLSDVP*ECQNNQL*KLTE1CEK£ 
KJCEFKKKMDDQRPEKJTEA* SKDKSPMEEEK 
TEMIRSY1QEVGRYIKRLEEAQSKRLEKLREK 
HKEIRQPILDEKPKGEG S SSFLSETCHEDTS WF 
PNFTP 


134 


1484 


A 


1666 


1276 


466 


' PGSTHASARITIY*L*IILSNATEVDNNFSKPPP 
FFPAGAPPASSSSSSSSSSPPTVSTAPPLIPPPGF 

nnnn/^ a nnnoi TTyTTT?Qf^I4 QQf; ^ AP AFPY ft 

PPPPGAPPPoLlr lliiovjrloou I L/Mvs/^Jvrt-^r i vj 

NVAFPHLPGSAPSWPSLVDTSKQWDYYARSS 

SSSSSSSSSSSSSPRDRDRER*RTRERERERDHS 

PTPSVFNSDEERYRYREYAERGYERHRASRE 

KEERHRERRHREKEETRHKSSRSNSRRRHESE 

t>o ncUD to UVT-TV V QVT? CV PfiK' F A ft S FP APEOE 

EGDSHRRH lvrllsJvoiNJ\oivnuisx-/\u 
STEATPAE 


135 


1485 


A 


1673 


1 


417 


PTRPVNSSQAFALVTYTLGALGGNLIAHMGL 
GYRYWAGIGVLQSCESALTHYRLVANHVAS 
DISLTGGSVVQRIRLPDEVENPGMNSGMLQE 
DLIQYYQFLAEKGDVQAQVGLGQLHLHGGR 
G V* QNnQKAr JJ i r IN L, A/\ 


136 


1486 


A 


1678 


525 


9 


ANTSLSSAAVSAVSPPPCRTSTATTLPPPMFSh 

FCVFPSPSMSPSPSEFLSC1ASVSRVKSLSSSSS 

GSSSTASSLNFSAIMGSSSATASWVLSTASTPP 

CPSALPSSPAQES*SLAASSSAWPVAGISPSGA 

CTFPAGSASGAAKAPSPSWRCreFRALFSLLD 

SSSLSL 


137 


1487 


A 


1680 


1 


2999 



AHRDEIQRKFDALRNSCTVTTDLEEQLNQLrfci 

DNAELNNQNFYLSKQLDEASGANDEIVQLRS 

EVDHLRREITEREMQLTSQKQTMEALKTTCT 

MLEEQVMDLEALNDELLEKERQWEAWRSVL 

GDEKSQFECRVRELQRMLDTEKQSRARADQ 

RJTESRQVVELAVKEHKAEILALQQALKEQK 

LKAESLSDKXNDLEKKHAMLEMNARSLQQK 

LETERELKQRLLEEQAKLQQQMDLQKNHIFR 

LTQGLQEALDRADLLKTERSDLEYQLENIQV 

LY SHEKVKMEGTISQQTKLIDFLQ AKMDQPA 

KKKKVPLQYNELKLALEKEKARCAELEEALQ 

KTRIELRSAREEAAHRKATDHPHPSTPATARQ 

Q1AMSAIVRSPEHQPSAMSLLAPPSSRRKESST 

PEEFSRRLK£RMHHNIPHRFNVGLNMRATKC 

AVCLDTVHFGRQASKCLECQVMCHPKCSTC 

LPATCGLPAEYVTHFTEAFCRDKMNSPGLQT 

KEPSSSLHLEGWMKVPRNNKRGQQGWDRK 

YTVLEGSKVLIYDNEAREAGQRPVEEFELCLP 

DGDVSIHGAVGASELANTAKADVPY1LKMES 

HPHTTCWPGRTLYLLAPSFPDKQRWVTALES 

WAGGRVSREKAEADAKLLGNSLLKLEGDD 

RLDMNCTLPFSDQWLVGTEEGLYALNVLK 

NSLTHVPG1GAVFQIYIIXDLEKLLM1AGEERA 

LCLVDVKKVKQSLAQSHLPAQPDISPNIFEAV 

KGCHLFGAGKIENGLC1CAAMPSKWILRYN 

ENLSKYCIRKEIETTSEPCSCIHFTNrySlLlGTNK 

FYEIDMKQYTLEEFLDKNDHSLAPAVFAASS 

NSFPVSIVQVNSAGQREEYLLCFHEFGVFVDS 
YGRRSRTDDLKWSRLPLAFAYREPYLFV 1H> 
NSLEVIE1QARSSAGTPARAYLDIPNPRYLGPA 
ISSGAIYLASSYQDKLRVICCKGNLVKESGTE 
uud nPSTSRR*PASPLPOYOGORAFLQGRRK 


138 


1488 


A 


1686 


2 


526 


GRPQGPAPG AGSPPESGPGLWAALGCSL V W V 
PLCCLGGAAGRL*ARSGKSGLRRRRAHAGPP 
PGGPCNSCP*CSAPESGGRGPLPGPGTGGVCS 
r u/tr OPOTTART AAAAAAPGP AGRRPPGG A 
PQNGSCAASASQEAAAPPPMCPPGRRWAVAS 
PPETRCPAAPGTRCRRLEAA J 


139 


1489 


A 


1693 


3 


376 


LPSMSNCTSCFRLQSRTES*IRQAGHJLLURNb 
rrcTv a t nr a urpqi rWI VI YFESSHKVDFVF 
IV*CFSTPPGAQMTIMSQACAERCN1MRLVDR 
RWAG1AKGVGTQKIIGRVHLGEQKALGL 


140 


1490 


A 


1704 


3 


376 


ERTNKPIKELIMDGKNL1AATKSLSVAQRKFA 
tict DntrifppPTnnAVTnDFRCIDASLREFSNFL 
KNLEEQREIMVS * EGCKLISQLSRGKKJWTWK 
LVLVEWKHLSLGTWHCNGKMRFPEP 


141 


1491 


A 


1743 


1 


362 


"l rrNK vf v arelscld vhld stgst av v adq 

DKLELELVLKGSYEDTQTSFLGTASAFRFHY 

>ii ay *t-c-t crDi T?^ir<rNnWTsIffDNSTGYLTV 
MAAL 1 liLouKl^Koois.Ji^vj vvi^vji-'i^^ i \j ± i ▼ 

PLRPLTIVKJENm^VPAPNVRGLKWMG 


142 


1492 


A 


1769 


1 


406 


NNPSTLPRGS*PMSPR1TMGRRRQRRREHKSb 
LSLASSTVGPGGQIVHTETTEWLCGDPLSGF 
GLQLQGGIFATETLSSPPLVCFIEPDSPAERCG 
t i r>\/rnt)\n qtmot ATFHfiTMT'FATNfOLLRDA 
ALAHKW 


143 


1493 


A 


1789 


1 


447 


QMLRNGGDQNTVPDYHFADRIRELL*P 1 bug 
KNCIP*DTYLRPSALGNIVEEVTHPCSPGPCPA 
NELCEVNRKGCTSGDPCLPYFCVQGCKLGQA 
cnriADnnn TOVP9SAGFVECYKICSCG0SGL 
LENCMEMHCMDLPTDTSALVR 


144 


1494 


A 


1814 


1 


404 


PGRRFRPRLSQAGTDSGS* VFPDSFPSAPAEPL 
PYFLQEPQDAYIVKNKPVELRCRAFPATQIYF 
KCNGE WVS QNDHVTQEGLDEATGLRVRE VH 
TPV^ROOVFFT FGLEDYWCOCVAWSSAGTTK 
SRRAYVRI 


145 


1495 


A 


1827 


26 


448 


XVEEKHADTWRSXCLSDFFFHAAKXLCXE^N 
PpnATCT QVnnMFGKGNGLTWAEKFOCEGSE 
THLAJXPIVQHPEDTCIHSREVGWCSRYTDV 
RLVNGKSQCDGQVEINVLGHWGSLCDTHWD 
PEDARVLCRQLNCGTAL 


146 


1496 


A 


1828 


574 


333 


' QHEGGDLRRRQLGEIQLTVRYVCLRAASAU* 
qxa a AFT* HHVP ASG ADPYVR VYLLPERKW A 
CRKKTSVKRKTLEPLFDET 


147 


1497 


A 


1855 


1 


372 


" ERLVLTSEHCLVLTLFWPSWTYHTLLLSRQH 
VRRLPKLTHAErTOHLASlMNKLLTNYDNLFE 
TSVTYSMG*HGAPTGSEAGANWNH**LHAH 
YYPPLLRSDTVRKFMVGSQMLAQAQRDLTPE 

Q 


148 


1498 


A 


1879 


568 


7 


LLSALDDKGGTQPSASFSNAPTIVCVTACPAG 

IAHTYMAAEYLEKAGRKLGVNVYVEKQGAN 

GLEGRLTADQLNSATACIFAAEVAIKESERFN 

GIPALSVPVAEPIRHAEALMQQALTLKRSDET 

RTVQQDTQPVKSVKTELKQALLSGISFAVPLI 

VAGGTOVA*AV*RQGISSLHDVQVRTWNS 


149 


1499 


A 


1880 


611 


24 


" GLN SENALSNEAMERGWQCLRLFAERLQDIP ' 
PSQIRWATATLRLAVNAGDF1AKAQEILGCP 
VQVISGEEEARLIYQGVAHTTGGADQRLVVD 
IGG ASTELVTGTG AQTT* LFSLSMGC VTWLEK 
YF ADRNLGQENFD AAQKAAREVLRPV ADEL 
RYHSWKEVRGASVTVQALQEIMMAQGMDE 
RJTME1WPVD 


150 


1500 


A 


1894 


2 


750 


"GRVDFFHTDYRPLIRDSNNYVLDEQTQQAPH 
LMPPPFLVDVDGNPHPTKYQRLVPGRENSAD 
EHLIPQLGYVATSDGEVIEQI1SLQTNDNDERS 
PESSILDGMIRQLQQQQDQRMGADQDTIPRG 
LSNGEETPRRGFRRLSLDIQSPPNIGLRRSGQV 
EG VRQMHQN APRSQ I ATERDLQ A WKRRVW 
PEVPLG1FRKLEDFRLEKGEEERNLYIIGRKRK 
TLOLSHKSDSVGLVSQSRPRTCRRKYP 


151 


1501 


A 


1900 


141 


785 


"GKTlQIQTTMQNKyKTVQKQYKll^MNFJ^A 
MEMQ1KKQFQDTCKVQTKQYKALKNHQLEV 
TPKNEHKTILKTLKDEQTRKLAILAEQYEQS1 
NEMMASQALRLDEAQEAECQALRLQLQQEM 
ELLNAYQSKIKMQTEAQHERJELQKLEQRVSL 
RRAHLEQKIEEELAALQKERSERDCNLLERQE 
REIETFDMESLRMGFGNLVTLDFPKEDYR 


152 


1502 


A 


1915 


2 


377 


LVRLLDTQRDGLQNYEALLGLTNLSGKSDKL 

RQKIFKERALPDIENYMFENHDQLRQAATEC 

MCNMVLHKEVQERFLADGNDRLKLWLLCG 

EDDDKVQNAAAGALAMLTAAHKKLCLKMT 

OVTT 


153 


1503 


A 


1921 


1 


237 


■"AYQSLRLEYLQIPPVSRAYTTACVLTSSAAvgL 
ELITPFQLYFIPELIFKHFQIWRL1TNFLFFVPFG 
FNFLLYMIFLYT 


154 


1504 




1928 


2 


354 


EMVEGGEGKMCINTEWGGFGDNGClDUIKi K 
YDTEVDEGSLNPGKQRYEKMTSGMYLGEIV 
RQILIDLTKQGLLFRGQISERLRTRGIFETKFLS 
QIESDRLALLQVRRILQQLGLD 


155 


1505 


A 


1929 


2 


369 


TEIAKIKMEAKKKYEKELTMFQNDFEKACQA 
KSEALVLREKSTLERIHKHQEIETKEIYAQRQ 
LLLKDMDLLRGREAELKQRVEAFESYQLELK 
DDYIIRTYRLIEDDRINIQISGHWQESP 


156 


1506 


A 


1935 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLbNtX 
DEDMH QKJAREMNLSET AFIRJCLHPTDNF AQ 
RSCFGLIWFTPTTDLQrLTSSILPSIL 


157 


1507 




1936 


584 


305 


ESKVNNEKFRTKSPKPAKSPQS A 1 KQLDQPTA 
AYEY\T)AGNHWCKX>CNTICGTMFDr h 1HMH 
NKKHTOGOFQKSSDFQKEELQQTFLPPERQG 


158 


1508 


A 


1939 


1 


423 


* TTHRLNVT AEPPCTSMPIY WMPDVPHKU i 1 A 
NTCPVDLTDYCAQNGFYCLVYGFLPYGSLED 
RLHCQTQ ACPPLSWPQRLD1LLGTARAIQFLH 
QDSPSLIHGDIKSSNVLLDERLTrKJ-GL'rCji.A 
RFSRFAGSSPIQSSM 


159 


1509 


A 


1974 


3 


401 


HTSTARLLLHRGAGKEAVTSDGYTALHLAAK 
NGHLATVKLLVEEKADVLARGPLNQTALHL 
AAAHGHSEVVEELVSADVIDLFDEQGLSALH 
LAAQGRHAQTVETLLRHGAHINL^bLK^ i^ju 
HGPAATLLR 


160 


1510 


A 


1982 


2 


417 


' KFLKDLEKQYNKEEPHLSEIGSCFLQNQEGFA 
[YSEYCNNHPGACLELANLMKQGKYRHFFEA 
CRLLQQM1DIAIDGFLLTPVQK1CKYPLQLAEL 
LKYTTQEHGDY SNIKAA YEAMKMV ACLrNER 
KRKLES1DK1A 


161 


1511 


A 


1984 


4 


770 


■ RETGSVSLSPSGLEGAESYAVSPILYSSPDVRb 
LWLETLQGQRHSKTGVKSTPGQSAAILMKLR 
SSHNASKTLNANNMETLIECOSEGDEKEHPLL 
ASCESEDSICQLffiVKKJUCKVLSVVTFLMRRLS 
PASDFSGALETDLKASLFDQPLSI1CGDSDTLP 
RPIQDILTILCLKGPSTEGIFRRAANEICARXEL 
KEELNSGDAVDLERLPVHLLAVVFKDFLRSIP 
RKLLSSDLFEEWMGALEMQDEEDRIEALK 


162 


1512 


A 


1986 


864 


501 


LLNSGLFSAPDGSNLEMRLTRGGNMCSGRIEI 
KFQGRWGTVCDDNFNIDHASV1CRQLECGSA 
VSFSGSSNFGEGSGPIWFDDLICNGNESALWN 
CKHQGWGKHNCDHAEDAGVICSSKD 


163 


1513 


A 


2001 


419 


187 


AVDLSIDESSLTGETTPCSKVTAPQPAATNGD 
LASRSNIAFMGTLVRCGKAKGWIGTGENSE 
FGDHNLSTFWHS 


164 


1514 


A 


2012 


284 


597 


SLLCLFPGTSTWCKPIVEETQLYVIVAQLFGO 
SHIYKRDSFANKFIKJQA1EILKIRJCPNDIETFKI 
ENNWYFW AD SSICAGFTTI YK WERETGFY SH 
QSFTR 


165 


1515 


A 


2013 


2 


403 


EDPEELGHFYDYPMALFSTFELFLTI1DGPAN Y 
NVDLPFMYSITYAAFAIIATLLMLNLLIAMMG 
DTHWRVAHERDELWRAQIVATTVMLERKLP 
RCLWPRSGICGREYGLGDRWILRVEDRQDLN 
RQRIQRYA 


166 


1516 


A 


2019 


2 


927 


CCQREGLGLKAWQILLSHGRNGLPGEPASS 

QGLSAASSTPVFHLALQIDSAPDNEDWVEMLF 

NKNMVTERLQNVMVLEQCFSDSSSLYRFLTY 

SYLLAFNVWLLLAPVTLCYDWQVGSIPLVET1 

WDMRNLATIFLAVVMALLSLHCLAAFKRLE 

HKEVLVGLLFLVFPFIPASNLFFRVGFWAER 

VLYMPSMGYCILFVHGLSKLCTWLNRCGATT 

LIVSTVLLLLLFSWKTVKQNEIWLSRESLFRS 

GVQTLPHNAKVHYNYANFLKDQGRNKEAIY 

HYRTALNNNKAWDYLCWRFRKTLTDLP 


167 


1517 


A 


2025 


696 


71 


AAASAASSLTVTLGRLASACSHS1LRPSGPGA 

ASLWSASRRFNSQSTSYLPGYVPKTSLSSPPW 

PEWLPDPVEETRHHAEVVKKVNEMIVTGQY 

GRLFAWHFASRQWKVTSEDLILIGNELDLA 

CGERIRLEKVLLVGADNFTLLGKPLLGKDLV 

RVEATVIEKTESWPRIIMRFRKRKNFKXKRIV 

TTPQTVLRINSIEIAPCLL 


168 


1518 


A 


2046 


2 


366 


HLQVAARVFMPLQAVDSAPKPLKGQAQAPg 
RLQGAARVFMPLQAQVKAKASKPLQMQIKA 
PPRLRRAARVLMPLQAQVRAPRLLQVQSQVS 
KKQQAQTQTSEPQDLDQVPEEFQGQDQVLR 


169 


1519 


A 


2049 


1 


945 


QNLEDREVLNGVQTELLTSPRTKDTLSDMTR 

TVEISGEGGPLGIHWPFFSSLSGRJLGLFIRGI 

EDNSRSKREGLFHENECIVKJNNVDLVDKTFA 

QAQDVFRQAMKSPSVLLHVLPPQNREQYEKS 

VIGSLNIFGNNDGVLKTKVPPPVHGKSGLKTA 

NLTGTDSPETDASASLQQNKSPRVPRLGGKPS 

SPSLSPLMGFGSNKNAKK1KIDLKKGPEGLGF 

TWTRDSSIHGPGPIFVKNILPKGAAIKDGRLQ 

SGDRILEVNGRDVTGRTQEELVAMLRSTKQG 

ETASLV1ARQEGHFLPRELVMFRSQSH 


170 


1520 


A 


2050 


363 


1 


PVATfTf TKTT NSDEHAWISSAKTLCETVKDF 
VAKVEKTYDKTLENAWADAVASKCSVLNE 
KLEQLLQ ALHTD SQ AAP VLPGLSPLIVEEDAV 
ESSSEESLGESKEQLGDDVTKPSSQKA 


171 


1521 



A 


2055 


139 


675 


" IPSRPWLGRITGLDPAGPLFNGKPHQDRLDPS 
DAQFVDVIHSDTDALGYKEPLGNIDFYPNGG 
LDQPGCPKTILGGFQ YFKCDHQRS VYX YLS SL 

nromTAVPrn<;vnnvT? NGK C V S CGTSOKJb 














SCPLLGYYADNWKDHLRGKDPPMTKAFFDT 
AEESPFCMYHYFVDUTWNKNVR 


172 


1522 


A 


2056 


3 


361 


LlQHKSAVEYAQSHLSLVSMCKESHKCbhPK 
MEWKVKIRSDGTRY1TKRPVRDRILKERALKI 

KJEERSGL 1 TDDD I M^tlVlKMuK I WOM^cj\xs.y 

HLVRGKEQRRRREFMMRIRLKCLKES 


173 


1523 


A 


2060 


1 


387 


"GTRILSMQIPFVGFQPIRTSEHMAAAGVFALL 
Q AY AFLQ YLRDRLTKQEFQTLFFLG VSL AAG 
AVFLS V1YLTYTGY lAr W ouKr I ol wyivji a 
KIHIPIlASVSEHQPTrWVSFFFDLHILGCTFPA 

G 


174 


1524 


A 


2071 


74 


443 


" LLMGPKAKXSGSKKKKV'IKAERLKLLQEEbt 
RRLKEEEEARLKYEKEEMERL£lQKlfcK£-K w 
HRLEAKDLERRNEELEELYLLERCFPEAEKLK 
OETKLLSQWKHYIQCDGSPDPSVAQEMNT 


175 


1525 


A 


2083 


139 


486 


"AALTWSQPQEFWPMEMQPIVTDMVrVHW v 
AESSTVGWLCALFRVTHVGVGATGHGVVCG 
RRVLCGLPLPSPAPMPIMSLPEGESRKEREVQ 
RLOFPYLEPGHELPATTLLAFLAAV 


176 


1526 


A 


2092 


3 


587 


EGSVNFKFGVLFAKDCigL 1 jJU^rvlFSNblObb^ 

FQKFLNLLGDTITLKGWTGYRGGLDTKNDTT 

GIHSVYTVYQGHEIMFHVSTMLPYSKENKQQ 

VERKRHIGNDIVTIVFQEGEESSPAFICPSMIRS 

HFTH1FALVRYNQQNDNYRLKIFSEESVPLFG 

PPLPTPPVFTDHQEFRDFLL VICLlNut.NA i lci 

PCI 


177 


1527 


A 


2103 


44 


427 


' gkgqvslegrphrgplclgswwpgsrvpul; 
cdgawlawacwvfgndfpspasaacsallg 

CSVSTACLCVPLCSGSPLAPFRRTAALQEGLR 
RAVSVPLTLAfc 1 V ASLWrAL^LLAKuursL^^ 
RSDLQ 


178 


1528 


A 


2104 


2 


409 


ALQSTLGAVWLGLLLNSLWKVAESKUgvbV 
PSTAASSEGAWEIFCNHSVSNAYNFFWYLHF 
PGCAPRIXYKGolvrbl^yOK i inm i it/wooo^ 
LILQVREADAAVYYCAVEVPNTDKLIFGTGT 

RLOVFPNIQNPD 


179 


1529 


A 


2111 


1 


312 


' TTRSSTRPPSLhVHASAKGGEKEEGUDGH Y L 
ixDTtrcLrrnT vifn^TNJ AMI VFMLKRNTEPKXG 
SYHFDLERLRAAHILFEREQEHLAPGGISMPL 

PPPLPLPACLG 


180 


1530 


A 


2116 


3 


366 


" TSrKRArFTTDVTRSFGWDSSEAWQQHiJVgfc 
LCRVMFDALEQKWKQTEQADLINELYQGKL 

, /r> , n m ct rrrvr nMTD TTYTY1 DFPT VTRPYGSS 
KD YVRSLECAj i buWKIui r uultLj * ijvt i uoj 

0 AF AS WCTFHLTAC VSLHR1HNSTVV 


181 


1531 


A 


2117 


2 


386


YGLGAHFGRLF1QAGI>JENDFYDGAWCAUK 
NDLQQWTEVDAJRRLTRFTGVITQGRNSLWLS 
DWVTSYKVMVSNDSHTWVTGKNGSGDMIFE 
GNSEK£IPVL1^PVPMVAJIYIRINPQSWFDN 

GSICI . — 


182 


1532 


A 


2123 


1 


493 


" RTKTDVY1LNLAVADLLLLFTLPFWAVNAVH 
GVVVLGKJMCKnSALYTLNFVSGMQFLACISI 
DRYVAVTKVPSQSGVGKPCWUCFCVVVTvlAAl 
LLSIRQLVFYTVNDNAROTIFPRYLGTSMKAL 
IQIVaLEICIGFVWFLIMGVCYFITARTLNIKMP 

NIKIS 


183 


1533 


A 


2140 


3 


561 



" RQAWHEAFKVRKEILTVICCLLAFCIGLIFVg 
RSGNYFVTMFDDYSATLPLL^LLENIAVCF 
VYGIDKFMEDLKDMLGFAFSR^YYMWKYI 
SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)














SPLMLLSLLIAS WNMGLSPHU Y NA WIEDKAS 
EEFLSYPTWGLAVCASLDVFAILPVPVAFIGR 
R FSLIDDG AGPFCSAAYTTTGCRTP YL 


184 


1534 


A 


2145 


3 


538 


HELTVAAADRGQPPQSSVV1»VTVTVLDVNU 

NPPVFTRASYRVTVPEDTPVGAELLHVEASD 

ADPGPHGLVRFTVSSGDPSGLFELDESSGTLR 

LAHALDCETQARHQLWQAADPAGAHFALA 

PVTIEVQDVNDHGPAFPLNLLSTSVAENQPPG 

TT VTTLHAIDGDAGAFGRLRYHL 


185 


1535 


A 


2151 


2 


671 


LDKXLDRMENYNIFNE YILKQ VAAT YIKLG W 

DVMMPsir,^ VOASYOHEELRREVIMLACSFG 

NKHCHQQASTLISDWISSNRJ^LNVRDIVY 

CTGVSLLDEDVWEFIWMKFHSTTAVSEKKIL 

LEALTCSDDRNLLNRLLNLSLNSEWLDQDAI 

DVlIHVAKuMPHGRDLAXVWFRDKWKILNTRI 

orwT1 ptttiF AFPLlLAFPnLYTAIDNPPLVREH 


186 


1536 


A 

■ .» ■ 


2153 


2 


400 


GPMCDKHSAFAEKFHAGFlDYTVHPLWlil wa 
iji at pnAnnn YTT EDNRNWVDSMIPQSPSPP 
LDEQNRDWQGLLEKLHVELTLDEEDSEGPEK 
EGEGQTYFTSSKTLCGrVPQNTDSLGETGIHIC 

AHDK-SP 


187 


1537 


A 


2158 


227 


442 


FNCFRVASDSFLENSSLLlMlLPLRNATQia-UK 
PGAVAYTCNPSTLGGWGGWITRSGVRDQPG 

OHf OTPS 


188 


1538 


A 


2167 


3 


486 


AHLGGAWLTQRSLGSWAAPGPAKAAKEWA 

CIPQNQKMNIWRMKTSKHLQLLSFVLGAVSP 

AVVWYMMVLQENGYGVEEGIPTLLMAASS 

MDDILAITGFNTCLSIVFSSGCARSSGSRNSKS 

i dtpi r/nrFGCDDSSIFSHLDHSSKWSSTYG 

HSGA 


189 


1539 


A 


2168 


2 


412 


" EFLSSNQITQLPN'l"l'FRPMPNLRSVDLSYMKi. 
QALAPDLFHGLRKLTTLHMRANAIQFVPVRff 
ODCRSLKFLDIGYNQLKSLARNSFAGLFKLTE 
LHLEHNDLVKVNFAHFPRLISLHSLCLRRNKV 

AIWSSLDW 


190 


1540 


A 


2179 


"™64 


"~399 ' 


■ MRXNQNTLLLES^GXXRPY'l^EHAPim^ 
MKADELLRWTTSEPLTLEHEYAMQRTWLED 
AYEOTIVLDAEKJOL^QPGATEESCMVGDVN 
T Fl TDLEDLTLGEIEVL1AEP 


191 


1541 


A 


2190 


1 


469 


" CLDRAAGIRHERNVIYrNETHTRHKUWLAKK 
T 9YVLF10ERDVHKGMFATKVTENVLNSSRV 
QE AIAE V AAELNPDG S AQQQS KA VN K VKKK 
AKRILQEMV ATV SPAMIRLTG WVLLKLFNSF 
FWNIQIHKGQLEMVKAATETNLPLLFLPVHR 

SH 


192 


1542 


A 


2197 


26 


157 


" PSKXGGIRLLLTGTQLYGRFGSAIAPLUULUK 
DGYNGEGREEPY 


193 


1543 


A 


2236 


2 


383 


" EYFPNSIWRSLFSTMDLGDIGFYiYKlLgALS 
YTHSKGIMHRDVKJPLNILCNSPRNKVILADW 
GLAEFYHPMRKYSVHVATRYYKSPEILLDYE 
YYDYSLDIWAVGVILLELLTLKLHVFEGGDN 

EQ 

— f o v n /ptccpp pnnPR SDWVTSYKVMGS 


194 


1544 


A 


2241 


105 


409 


RKGVGKMr 1 bfcGKrOyiiK.ai-/w viom» mv,kJ 
OTSHTWVTVKNGSGDMIFEGNSEKEIPVLNE 
LPWMGARYIRINPQSWFDKGSICMRMEILGC 

PI PHPNNY 


195 


1545 


A 


2245 


1 


672 


MGVASDWTKJUEYQPGSUSMPLFPSIHLfc 1 LU 
GAVSSLQIVTELQTKYIGKGCDRETYSEKSLQ 














KLCGASSGUDLLPSPSAA'I 'NWTAGLL VDi>bfc 
MIFKPIX}R(^AKIPDGWPKNLTDQFTrrMW 
MKHGPSPGVRAEKETILCYSDKTEMNRHHY 
ALYVHNCRLVFLLRKDFDQADTFRPAEFHW 
■is r nriOAT AKVnGOPGKSITRQLQEMPVTlQG 

1SLKPS 


196 


1546 


A 


2256 


1


396 


FRGTPVSGLTNRDTLA V 1KHFREPIRLKT V KF 
GKVINKDLRHYLSLQFQKGS1DHKLQQVIRD 
NLYLRTIPCTTRAPRDGEVPGVDYNFISVEQF 
KALEESGALLESGTYDGNFYGTPKPPAEPSPF 

npppv 


197 


1547 


A 


2259 


43 


594 


OLAIEIGVRALLFGVFVKl EtLDPKjRVIQPEEI 
WLYKNPLGQSDNIPTRLMFAISFLTPLAVICV 
VKIIRRTDKTEIKEAFLAVSLALALNGVCTNTI 
KLIVGRPRPDFFYRCFPDGVMNSEMHCTGDP 

HCFTESGRGKSWRLCAAILPL 


198 


1548 


A 


2275 


3 


404 


TCTTVVVIPRMLVDFLSESKTISLPECAlljNlFF 
FLGFASNNCFIMAAMSYDRYTAIFfNPLQYHT 
_ „ —y.o. /-w a\ /cka a curM VfiFT F^T .CirVTVFN 

lsu:dlntiqhyfcdispwslacnytfyhem 


199 


1549 


A


2315 


1


375 


LTOMFFIHALSAIESTILLAMAFDRY V AlUhUO. 

wiUvlnntvtaqigivavvrgslfffplplli 

KRLAFCHSNVLbno it vn^ vmisj-^ i ^ 
PNWYGLTAILLVMGXDRMFISLSYFLII 


200 


1550 


A 


2334 


2 


409 


PRVRPQQRlCMSFFFKTELGEKLVTK^Lhb 1 )Jl 
SDDPMLPSPDQLKKKAPFTNKKLKAHQTPVD 
ILKQKAHQLASMQVQAYNGGNANPRPANNE 
EEEDEEDEYDYDYESLSDDNILEDRPENKSCH 

DOLQFEYKbhM 


201 


1551 


A 


2350 


3 


512 


lSWEAQlAEIIQWVSDEKDARGYLgALAbKJvi 
TEELEALRSSSLGSRTLDPLWKVRRSQKLDM 
SARLELQSALEAEIRAKQLVQEELRKVKDAN 
LTLESKLKDSEAKNRELLEEMEILXKKMEEK 
FRADTGKLMLCDSALFEYKYFSNECFYFLFD 

MVTLEAPTEFQIQY 


202 


1552 


A 


2351 


1 


1003 


" PSSYSSDELSPGEPLTSPPWAPLGAFliKrtinLL 
NRVLERLAGGATRDSAASDELLDDIVLTHSLF 
LPTEKFLQELHQYFVRAGGMEGPEGLGRKQA 
CLAMLLHFLDTYQGLLQEEEGAGHIIKDLYL 
t Tvn/nuci vnni R FDTT RT HOLVETVELKIPE 
ENQPPSKQVKPLFRHFRR1DSCLQTRVAFRGS 
DEIFCRVYMPDHSYVT1RSRLSASVQDILGSV 
TEKLQYSEEPAGREDSLILVAVSSSGEKVLLQ 
PTEDCVFTALGINSHLFACTRDSYEALVPLPE 
r m vQpr.nTF IHR VEPED V ANHLT AFHWELFR 

CVHELEFVDYVFHGE 


203 


1553 


A 


2361 


2


403


NNLNCAEPLFEQNNSLN VNFNTQKJC'l v WLUi 
rvT3Pvn9TPt Wl ONF\TULLNEEDMNVIV\ r D 
WSRGATTFrWRAVKNTRKVAVSLSVHIKNL 
LKHGASLDNFHFIGGSLGAHISGFVGKIFHGQ 

LGR1TGLDP 


204 


1554 


A 


2390 


280


476


" SPSLLPQCLMSLSDLSLSr-APPSHLSPktPol Q 
AGSRLGAMRRCAREMDATPMPPAPSCPSERV 

T 


205 


1555 


A 


2400 


543 


745 


AAV ALRDISWQQPYPMDFYAGS SLOP W 1 vrs 
HGQDRRPHAPGRPARGKVQEGSARPPSAVAC 

EDCSCR 


206 


1556 


A 


2406 


122 


485 


DLSPDSREDHPQGHRRLLPKRPVRGSLMFUH 
THHPCPVSSTTNDTPDQ[WVSVGSLRMGTGG 
MGANASTSPRCWDLSSGNKKWIIQVPILASIV 
RSRGGLLATGVGGMCACVPRNOPLTGT 


207 


1557 


A 


2409 


289 


418 


TWTl YRHKOOVOHNHSNRLSCRPSQEDRAT 
HTIMVLDKENTLS 


208 


1558 


A 


2413 


64 


492 


VQGTGXXFIAFTEAMTHFP ASPVW AUMi-1- L 
MLINLGLGSMIGTMAG1TTP11DTFKVPKEMFT 
GGCCVFAFLVGLLFVQRSGNYFVTMFDDYSA 
TLPLTLIVILENI A V AW I YGTKKFMQELTEML 
GFRP YRFYF YMWKF V SP 


209 


1559 


A 


2417 


3 


877 


EKERLLDE WFTLDE V PKGKLHLKLb WLTLMP 

NASNLDKVLTDIKADKDQANDGLSSALLILY 

LDSARNLPIRYKTKEPVWEENFTFFIHNPKRQ 

„ r r , ;CA . D rvcmi-jrvr'PT p f NT KVPI SOLLTSEDM 

TVSQRFQLGNSGFNSTTKMK1ALRVLHLEKRE 

RPPDHQHSAQVKRPSVSKEGRKTSIKSHMSG 

SPGPGGSNTAPSTPVIGGSDKPGMEEKAQPPE 

AGPQGLHDLGRSSSSLLASPGHISVKEPTPSIA 

orvrcT DTATHPl POP! POI FNGTTLGQSPLGQ1 

QLTIP 


210 


1560 


A 


2422 


35 


456 


REFAASDLEPFTPTDQPlSPhAl 1 QPSOKRQRA 
AGNPGSLAATIDHKPCSAPLEPKJQASRKQRW 
GAVRAAESLTDLAEPASPQVHETPIDASQTQK 
VEPASKSRFTPELQAKVSHSRERALSTMDATP 

HHAQPQRGEG 


211 


1561 


A 


2431 


1 


764 


RRYSQKLIQHTACQLLRTYPAA'l RiDSSNPNP 
LMFWLHGIQLVALKYQTDDLPLHLNAAMFE 

DLDSMDPAVYSLTIVSGQNVCPSNSMGSPCIE 
VD VLGMPLD SCHFRTKPIHRNTLNPMWNEQF 
LFHVHFEDLVFLRFAVVENNSSAVTAQRIIPL 
KALKRGYRHLQLRNLHNE VLEI SSLFINSRRM 
cTTATQQnNmaSASSMFOTEEPJCCLOTHRVTVH 

fiVPG 


212 


1562 


A 


2436 


1 


411 


" GIRGTTGHLGCPINDDPSL'lLlVSWVMEDKPi 
YIGNGTKKEDDSLTIFAVAKRDHVSDTCGAC 
TDLDHNLDKGYLTVLGEQATPTNRLGALPKG 
RANRTRDLELTYLAERIVRLTW1PGDANNRPI 

TDYDCQIEEHQ 


213 


1563 


A 


2445 


1 


1294


" MSSIGCLWVSRSSQIDGLlAEKSUPbKrHUi 
WLMPELHPKEQILELLVLEQFLSILPEELQIWV 
QQHNPESGEESVTLXEDLEREFDDPGQQVPAS 
PQGPAVPWKDLTCLRASQESTDIHLQPLKTQ 
LKSWKPCLSPKSDCENSETATKEG1SEEKSQG 
LPQEPSFRGISEHESNLVWKQGSATGEKLRSP 
SQGGSFSQVIFTNKSLGKRDLYDEAERCLILT 
TnQTMPOTCVPPEERPYRCDVCGHSFKQHSSLT 
QHQRIHTGEKPYKCNQCGKAFSLRSYLIIHQR 
IHSGEKAYECSECGKAFNQSSALIRHRKIHTG 
EKACKCNECGKAFSQSSYLIIHQR1HTGEKPY 
ECNECGKTFSQSSKLIRHQRIHTGERPYECME 
CGKAFRQSSELITHQRIHSGEKPYECSECGKA 

FSLSSNLIRHQRIHSG 


214 


1564 


A 


2461 


1 


615 


GIPGSTISSSRNIFLEDDLAWQSLIHPD55N irL 
STRLVSVQEDAGKSPARNRSASITKLSLDRSG 
SPMVPS YETS V SPQANRTYVR I Kl 1 EDERKIL 
LDSVQLKDLWKKICHHSSGMEFQDHRYWLR 
THPNCIVGKELVNWLIRNGH1ATRAQAIAIGQ 














AMVDGRWLDCVSHHDQL>RDEYALYKFLQV 
LFSVYCQLECSKLIL 


215 


1565 


A 


2464


3 


2932 


GPGVRSSQDGMADVFVHLRTAWPRCSFISOQ 

HGPGRHGRRVCSSQDSMADVTVHLRTAWPT 

CSLISGQHGPGESVSYEDDDIPAPASLLHVNA 

AAPALTNPTAPVLCTAPNNTAQKEKVPSGMR 

QRPAGVR1SSRTPDLTCAVSTHSTVPGVRISSC 

TPDLTCAVSIHSTVPSVCISSCTPDLTCAVSTH 

STVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSCTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHSTVPGVRISSCTPDLTCAVSIH 

ATVPGVR1SSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVR1SSRTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSIHATVPGVRISSC 

TPDLTCAVSIHATVPGVR1SSRTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVR1SSCTPDLTCAVSTH 

STVPGVRISSRTPDLTCAVSIHATVPGVHISSC 

TPDLTC A VSTHSTVPG VR1S SRTPDLTC A V SIH 

STVPGVCISSRTPDLTCAVSIHSTVPSVHISSCT 

PDLTCAVSIHSTVPGVRISSRTPDLTCAVSTHS 

TVPGVHISSCTTDLTCAVSIHATVPGVHISSCT 

PDLTCAVSTHTTVPGVRISSRTPDLTCAVSIHS 

TVPGVRISSCTPDLTCAVSTHSTVPGVRISSRT 

„. i ,;cTm T\rori\rDT9^PTPnT TPAVSIHA 

TVPGVHISSCTPDLTCAVSIHATVPGVRISSRT 

PDLTCAVSmATVPGVHlSSCTPDLTCAVSTHS 

TVPGVR1SSRTPDLTCAVSIHSTVPGVHISSCT 

PDLTCAVSTHSTVPGVHISSCTPDLTCAVSTH 

cr\m^\nJTQQUTPnT TP AVSIHATVPSVH1SSC 

TPDLTCAVSIHSTVPGLLTSVSQTSTG 


216 


1566 


A 


2477 


1 


414 


FRTKSYRKGSYRCIVSEWlAEQGNWQbigLK 
AVEVATWIQPTVLRAAVPKNVSVAEGKELD 
LTOmTDRADDVRPEVTWSFSRMPDSTLPGS 
t>\/t a t> t ™?typt VHS<sPHVALSHVDARSYHLL 
VRDVSKENSGYYY 


217 


1567 


A 


2480 


2 


460 


" "CRTLCEGPQRFEEYEYLG YKAGLYEA1ADW Y 
xxnvr vrnHFrVRELATRPGRLSPIENFLPLHY 
DYLQFAYYRVGEYVKALECAKAYLLCHPDD 
EDVLDNVDYYESLLDDSIDPASIEAREDLTMF 
VKRHKLESELIKSAAEGLGXSYTEPNYW 


218 


1568 


A 


2483 


140 


383 


" AFSSPHPSPAPQFPECGFYGLYDKILLFKHUKI 
<;ant T OI VRSSGD10EGDLVEWLSASATFED 
LOIRPHALTVHSYRAP 


219 


1569 


A 


2489 


3 


428 


" SSRLVLLAGAAALASGSQGDREPVYRDCVLg 
CEEQNCSGGALNHFRSRQPIYMSLAGWTCRD 
DCKYECMWVTVGLYLQEGHKVPQFHGKWP 
FSRFLFFQEPASAVASFLNGLASLVMLCRYRT 
FVPASSPMYHTCVAFAWVS 


220 


1570 


A 


2498 


1 


1297 


" MDGEAVRFCTDNQCVSLHPQEVDSVAMAPA 
APKJPRLVQATPAFMAVTLVFSLVTLFVVDH 
HHFGREAEMRELIQTFKGHMENSSAWWE1Q 
MLKCRVDNVNSQLQVLGDHLGNTNADIQMV 
KGVLKDATTLSLQTQMLRSSLEGTNAEIQRL 
KJBDLEKADALTFQTLNFLKSSLENTSIELHVL 
SRGLENANSEIQMLNASLETANTQAQLANSS 
LKNANAEIYVLRGHLDSVNDLRTQNQVLRNS 
LEGANAEIQGLFCENLQNTNALNSQTQAFIKSS 














"FDNTSAE1QFLRGHLERAGDEIHVLKRDLRM 
VTAQTQKANGRLDQTDTQIQWKSEMENVN 
TLNAQIQVLNGHMKNASRE1QTLKQGMKNA 
SALTSQTQMLDSNLQKASAEIQRLRGDLENT 
VA1 -nrcmncn^PT \CT\ HWITSOEOLORTO 


221 


1571 


A 


2501 


3 


500 


RVRLNNDGLSPLMMAAK 1GKIGIFQH1IKKL V 

TDEDTRHLSRKFKDWAYGPVYSSLYDLSSLD 

TCGEEASVLEILWNSKIENRHEMLAVEPINE 

LLRDKWRKFGAVSFYINVVSYLCAMVrFTLT 

, ^'mn^m^mi — r\rrwn u? a rVFV m FT 
AYYQPLEGTPPYPYR 1 1 VUiLKLAUcviiLr i 

GVLFFFTN 


222 


1572 


A 


2508 


3 


395 


DAHCQRKLAMQEFMElNERLTELHTgK^iU. 
ARHVRDKEEEVDLVMQKVESLRQELRRTER 
AKKELE VHTEALAAEA SKDRKLREQ SEHYSK 
QLENELEGLKQKQIS YSPG VCblbH Wti 1 
KTHT FKKS 


223 


1573 


A 


2544 


2 


412 


NbPAlISNFSAAWHTlVNBaESMTSLEVJK 
MVDERTD YLTKSLKEKTPPFSHCDQ A VLQC S 
EASSNKDMFADRLSKSIIKHSIDKSKSVIPNID 
KN AV YKESLP V SGEESQLTPfcK.br JSJ 4 ru^n v 
LTHCSLSAA 


224 


1574 


A 


2552 


401 


1 


GASLCFISTAFTVLTFL1DSCRFSYPERP11M.5M 
CYNIYS1AYIVRLTVGRERISCDFEEAAEPVLI 
QEGLKNTGCAIIFLLMYFFGMASSIWWVILTL 
TWFLAAGLKWGHEAIEMHSSYFHIAAWA1PA 

VK 


225 


1575


A 



2563 


724 


1 


MSARKERREKGEEEGEGEKDGDEDbKJbfctKt 
GLGEEEEKEAGKKKKKQEEKEKEKOAV Y 
VARICK^MGGSQRVLEKHWTSFLKARLNC 
SVPGDSFFYFDVLQSITDnQINGIPTWGVFTT 
QLNSIPGSAVCAFSMDDffiKVFKGRFKEQKTP 
DSVWTAVPEDKVPKPRPGCCAKHGLAEAYK 
TSIDFPDETl^FIKSHPLMDSA V±TlAur.r w r i 
KTRVRYRLTAISVDHSAGPYH 


226 


1576 


A 


2571 


449


3 


' EGVLFVYGKyVGDVMNhEMAAEMAQEVAJP 
TRTVLTTDDISSSPIEDRDGRRGVAGNFFIFKV 
AGAACDRGMSLEACEAVTRKANRRTYTMG 
VALEPCSLPQTRRHNFE1GAEEMEIGMG1HGE 
RGVIREKMMPADAIVDHIMDRIFS 


227 


1577 


A 


2575 


3 


1197 


" VLSDLCLFY YRDEKEEGILGSILLPSby IAJLL 1 S 
EDHINRKYAFKAAHPNMRTYYFCTDTGKEM 
ELWMKAMLDAALVQTEPVKRVDKJTSENAP 
TKETNN1PNHRVLIKPE1QNNQKNKEMSKIEE 
KKALEAEKYGFQKDGQDRPLTKJNSVKLNSL 
PSEYESGSACPAQTVHYRPI^SSSENKIVNVS 

LADLRGG>nOTNTGPLY ltAJJKVi^i^iiNoivivv 
LE0W1KJQKGRGHEEETRGVISYQTLPRNMPS 
HRAQIMARYPEGYRTLPRNSKTRPES1CSVTP 
STHDKTLGPGAEEKRRSMRDDTMWQLYEW 

AnnArWTvnCTI DDUQT! <sSPKTMVNISDOT 

^SIPTSPSHGSIAAYQGYSPQRTYRSEVSSPI 
ORGDVTIDRRHRAHHPKVK 


228 


1578 


A 


J.JOJ 


3 


330 


LPFLGLGSVLPCKjMVMASPEMNPTICSVhbA 
HIVLLFHATTFRRGFQVTVLVGNVRQTAVVE 
KIHAKVRGTWPFISPEVRKEGGLPQTGRELLD 
PTMGIKPHLWVAA 


229 


1579 


A 


2589 


1 


448 


DDKKAQGIKKHVKPTSGNAFTICKYPCCiKbK 
ECVAPNICKCKPGYIGSNCQTALCDPDCKNH 
GKCIKPN1COCLPGHGGATCDEEHCNPPCQH 
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GGTCLAGNLCTCPYGFVGPRCETMVCNRHC 
ENGGOCLTPDICQCKPGWYGPTCSTA 


230 


1580 


A 


2593 


2 


138 


AVTFSWFAYVADITQEHERSMAYGLVCMF1 
LYLLYLLRNAFFLR 


231


1581 


A 


2595 


185 


2 


qopvthft PWPTEEOKLLEQALKTYPVNPPER 
WEK1AEAVPGRTKKACIKRYKVADLRISK 


232 


1582 


A 


2596 


1 


391 


STVTGQPRRLLDTAGHQQPFLELKIRANEPGA 
rDADDBTPTPFPATPI CCRRDHYVNFOELGW 
RDWILLPEGYQLNYCSGQCPTHLAGSPGIAAS 
t?u<?avp<;t I KANNPWPGRTSWCVPTARRPLS 
LI ,YL 


233 


1583 


A 


2601 


184 


403 


LLFSDE1IMAAPLRI ADVTSGLIGGEDGRV Y V 

YNGKETTLGDMTGKCKSWITPCPEEKVNVLQ 

NSIPYWERIT 


234 


1584 


A 


2614 


178 


335 


PLTLCLPENNKPPQADA VPDKELTLPVDS' 1 TL 
DGSKSSDDQKIISYLWEKTQ 


235 


1585 


A 


2616 


2 


896 


DVLEVYGTGVASTRHEMGTLDKHKELEDLV 

AKFLNVEAAMVFGMGFATNSMNIPALVGKG 

CLILRDEVNHTSLVLGARLLGATIGIFKHNYA 

QSLEKLLRX) AVIY OQr K 1 KKA w kjvij^ll v c\j v 

YSMEGSIVHLPQIIALKKKYKAYLYIDEAHSI 

GAVGPTGRGVTEFFGLDPHEVDVLMGTFTKS 

FGASGGY1AGRKARJLSPPACLVPNTGSHSLH 

RLTRDLQMNEAMVALVTDRLQGWNSGEGN 

wrr\T> A-TWErm vnvT PVWSHS A VYASSMSPPI 

AEOIIRSLKLIMGLDGTTQ 


236 


1586 


A 


2621 


1 


392 


NTSSFPAQPSSPARPSLPHLSQHPSNPLLPLAS 
ADHPQCGRFLPLHEPEPLCPSPSLSYPTLVSS 
WSSPFSSHHGCPPGLYPFPTSPKTIQPPGLAQL 
nBor.unm ROAO^MPGHGALSPLLLPP 

A 


237 


1587 


A 


2628 


398 


1 


DLVCKISGFGRGPRDRSEAVYTTMSGRSPAL 
WAAPETLQFGHFSSASDVWSFGIIMWEVMAF 
GERPYWDMSGQDVIKAVEDGFRLPPPRNCPN 
LMHRLMLDCWQKDPGERPRFSQIHSILSKMV 

QDPEPPNV 


238 


1588 


A 


2631 


1 


1104 


WSPCSLTCGVGLQTRDVFCSHLLSREMNbl v 

ILADELCRQPKJSTVQACNRFNCPPAWYPAQ 

WQPCSRTCGGGVQKREVLCKQRMADGSFLE 

LPETFCSASKPACQQACKKDDCPSEWLLSDW 

TECSTSCGEGTQTRSAICRKMLKTGLSTWNS 

tt nppi PP^STPprMl ATCARPGRPSTKHSPH1 

AAARKVYIQTRRQRKLHFVGGGFAYLLPKTA 

VVLRCPARRVRKPLlTWEICrXKJHLISSTHVT 

V APFG YLKJHRLKPSDAG VYTCS AGPAREHF 

V1KL1GGNRKLVARPLSPRSEEEVLAGRKGGP 

KEALQTHKHQNGIFSNGSKAEKRGLAANPGS 

RYDDLVSRLLEQGAPCSSSKKKN 


239 


1589 


A 


2636 


1 


678 



" MKPDNILLDEHGHVHIlDFNIAAMLPRb-igil 
txx Ar.wPVMAPFMFSSRICGAGYSFAVDWW 
SLGVTAYELLRGRRPYHIRSSTSSKEIVHTFET 
TVVTYPSAWSQEMVSLLKKLLEPNPDQRFSQ 
LSDVQNFPYMNDINWDAVFQKJU,IPGFIPNK 
GRLNCDPTFELEEMILKKPLHKKXKRLAKK 
EKJDMRKCDSSQTCLLQEHLDSVQKEFniNRE 

KVNRDCI 


240 


1590 


A 



2639 


389 


3 


ELLDPTTPMRTKCIELLYAALTSSSTDQPKAD 
LWQNFAJ^EIEEHVFTLYSKNIKJCYKTCIRSKV 
ANLKNPRNSHLQQNLLSGTTSPRFJAEMTVM 
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fmaKTKFI KOI RAWTESCIOEHYLPQVIDGTL 
Y 


241 


1591 


A 


2640 


392 


3 


IIU.mRCWMRLATICVLVFTLGSKITSCDDD 
TCDLCGYNQKLYPCWETQVGQEMYKLMIFD 
FIIILAVTLFVDFPRKLLVTYCSSCKLIQCWGQ 

nnr a roriMVT nrwrJOTTPWlG AFFSPLLPAM 
Y 


242 


1592 


A 


2642 


405 


1 


" YFKNTTLLL VG VICV AAA VEKWNLHKRJ ALR 
Nm.MAGAKPGMLLLCFMCCTTLLSMWLSNT 
STTAMVMPIVEAVLQEL V SAEDEQLV AGN SN 

T»i-r; * T? r>T ct t^\/VKTQOP^VTPT IFVWFDn DFT MK 
TEEAErloLUVKjNov^ro vcL-ir v jNci^LL-i^ri^ivirv 

SPLMISQACI 


243 


1593 


A 


2646 


412 


2 


CLAMIKGIQSSGKIIYFSSLFPYVVLICFLIRAr 
LLNGSn)GIRHMrTPKLEIMLEPKVWREAATQ 
VFF ALGLGFGGV1AFSSYNKRDNNCHFDAVL 
VSFINr>TSVLATLVWAVLGFXANVINEKCIT 
QNSETV 


244 


1594 


A 


2650 


1 


1271 


MTTTLIGLLKTARLLRLVRVARKLDRYSEYG 

AAVLMLLMCIFALIAHWLACIWYAIGNVERP 

YLTDKIGWLDSLGQQIGKRYNDSDSSSGPSIK 

DKYVTALYFTFSSLTSVGFGNVSPNTNSEKIF 

SICVMLIGSLMYASIFGNVSAIIQRLYSGTARY 

HMQMLRVKEFIRFHQIPNPLRQRLEEYFQHA 

WTYTNGIDMNMVTNGTC S SCTSDDGHFIL VS 

NHHQGGLIY S WNDAASMQKrFN riIKb&L.iAJ a 

TSDSNLNKYSTINKIPQLTLNFSEVKTEKKNSS 

PPSSDKTUAPKVKDRTHNVTEKVTQVLSLGA 

DVLPEYKLQAPRINKFTILHYSPFKAVWDWLI 

LLLVIYTAIFTPYSAAFLLNDREEQKRRECGY 

SCSPLNVVDLlVDlMrUDlLirSrKl 1 i YiNVncc, 

WSDPASV 


245 


1595 


A 


2656 


385 


2 


NLTWWPLFRDVSFYIVDLIMLHFFLDNVIMW 
V^SLLLLTAYFCYVVFMKFNVQVEKWVKQ 
i *TXTT>x7VTrv/v'\/T apt; ac\ a vp<ja ARDKDFPTLP 
AJCPRLQRGGSSASLHNSLMRNSIFQNKIHTLD 

PHV 


246 


1596 


A 


2660 


200 


506 


\n \n riA/fXTWOMT TrWVT FFKVNFFI AFEGPI 
LLDl^RIKHLDCTNQLSQATALAKLCSDHPEIG 
[KGSFKQTYLVCLCTSSPNGKLIEEVSMFSF1S 

NYFLS 


247 


1597 


A 


2678 


3 


267 


DAWVKNDIIFNQTCRKQKJSENLKrILASVRV 
vnirwT wwfil SORT ADPEVSPLVFFVILIFF 
VSLSYLEirFDPAQLCDSSEHnS 


248 


1598 


A 


2687 


1 


404 


DFTTLAAMMRTLFSLFGD VRSDVHRFS V 1 LF 
GAAIKSVKNPDKKSIENQVLDSLVPLLLYSQD 
ENDAVAEESRQVLTICAQFLKWKLPREVYSK 
nPWTTrvPTFApTTTrRFFEKJCCKGKINILEOTL 

MYSKNPKL 


249 


1599 


A 


2692 


1 


440 


FRRJOIRRRERDCAAQGARRHCRHLAECKJLV 

SFPIGmCVLRNVSGQIHLITLANNELKSLTSK 

FMTTFSQLRELHLEGNFLHRLPSEVSALQHLK 

ALDLSRNQFQDFPEQLTALPALETINLEENEjV 

DVPVEKLAAMPALRSINL 


250 
251 

64


399

A 
A 


2693 
2694 


459 

1* 


21 
| 404 


LLPGSLGVPILHSQPWDPSPQCPHRAPS'IPRKL 
PPLGALSQALTFLSRAAKNHSQDPGKGTKPFP 
AAPAAPPPRSSLPAPLPMGLKDKGPQPAPPTIF 
NSPWHPATLPGALGPQLSQAAPSPIPPPCLMG 

ISSCPDLKLTKSSTP 
" FVFDLKLRVPGFAALLIHGASSVPGPETVKLK 














QKRKKKAPDHSSGRKEELVriHTVDKLETKK 
PVGRVLCGLSGELLHSLLLPRRKTEKRALGSH 
nu AHFPFHPVAPEPLSNSCOISKEGREQVLSEI 

GAGDCL 


252 


1602 


A 


2697 


421 


1 


PQKSHSGAYQCFATRKAQTAQDFAJIALEDG 
TPRIVSSFSEKWNPGEQFSLMCAAKGAPPPT 
\rru/AT nnPPnn^nG^HRTNOYTMSDGTTISH 
MNVTGPQIRDGGVYRCTARNLVGSAEYQAR1 
xrv/D riPP<5 7T? A M"R NTT 


253 


1603 


A 


2698 


65 


401 


ACCQWRRTLIPAKSTTVSCTISTPHHPFRGSYS 
FDDHITDSEALSRSSHVFTSHPRMLKRQPAIEL 
PLGGEYSSDVPRPLSTQLSSSLLGYFSTLMTG 
AAFTNNIASSTIIL 


254 


1604 


A 


2699 


438 


301 


(3QIHSQDDPPFIDQLGFGVAPGFQTFVACQEQ 
RVRGPWEAGPGVGY 


255 


1605 


A 


2700 


1 


842 


~TQNRED S SEGJivKKL VEAEELEEKHREAQ V S 
AQHLEVHLKQKEQHYEEKIKVLDNQIKKDLA 

DKETLENMMQRHEFAEAHfcJ^ 

AMDSKIRSLEQRTV^LSEANKLAANSSLFTQR 

NMKAQEEMISELRQQKFYLETQAGKLEAQN 

PJCLEEQLEKJSHQDHSDKNRLLELF^TRLREVS 

LEHEEQKLELKRQLTELQLSLQERESQLTALQ 

a An a at T7crM DfiA VTP1 FFTTAFAFEEIOALT 

VGLGSNJERLLKASARMSVELALSILAHP 


256 


1606 


A 


2701 


2 


405 


FVGOPGADPPVAVM WDPRAARMDLTA YAE 
LLKESGNQVLKNGNFSLAIRKYDEAIQILLQL 
YQWGVPPPJ3LAVLLCNKSNAFFSLGKWNEA 
r-\7A a is c-z^t r^umPTVATTcrnVYRAGYSLLRLHO 
PYEAARMFFEGLR 


257 


1607 


A 


2702 


2 


399 


FVESASSRPPGCFSGDGRFWLVSEGSRRU WD 

FNPSFSFLDPRYSVGGDEN1GTVTTLAN1LREF 

NPSLKGFSVGTGKETSPNAFLNQAVAGGRAE 

DLPVQARRLVDUdK2SrDTRJOT 

r;r f unT 


258 


1608 


A 


2709 


1 


1097 


" SVGARQGEARDRIRRFFPKGDLEVLQAUVfcKl 
MTRKELLTVYSSEDGSEErTHTVl^KALVKACG 
SSEASAYLDELRLAVAWNRVDIAQSELFRGD1 
QWRSFHLEASLMDALLNDRPEFVRLL1SHGLS 
LGHFLTPMRLAQLYSAAPSNSLIRNLLDQASH 
o aptvadat vt-p-a AFT RPPOVGHVLRMLLG 
KMCAPRYPSGGAWDPHPGQGFGESMYLLSD 
KATSPLSLDAGLGQAPWSDLLLWALLLNRA 
QMAMYFWEMGSNAVSSALGACLLLRVMAR 
LEPDAEEAJUIR^LAFKFEGMGVDLFGECYR 
qqpvp A ART T T RRCPLWGDATCLQLAMQAD 
ARAFFAODGVQSLPTQKWWGDMARR 


259 


1609 


A 


2721 


1 


403 


" VYLGAGPGLFFSNEGAKEGEKANIPKLMLPR 
GGFSQRENfVTGERSPSPEEEEEEEEEGFGERA 
SCRRGLFRVRLTRVGLAAPSKASRGQEGDAA 
PKSPVREKSPKFRFPRVSLSPKARSGSGDQEE 

GGLRVRLP 


260 


1610 


A 


2728 


1 


477 


" LLGGDLRYFlLQQNVriFTEGTVKLYlCELALA 
LEYLQRYmiHRDIXPDKlLLDEHGH\WTDFr> 
IATVVKGAERASSMAGTKTYMAPEVFQVYM 
DRGPGYSYPVDWWSLGITAYELLRGWRPYEI 
HSVTPIDEILNMFKVERVHYSSTWCKGMVAL 

LRK 


261 


1611 


A 


2730 


3 


547 


" LTlTDFILVLYRYYRSPLVQIYFffiQHKJETWR 
ET^QGCrTO > LVSlSP>roSLFEA\nfTL 
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RLPVLDPVSGNVLrnLTHKlvLLKFLHIFviSLL^ 
RPSFLYRTIQDLGIGTFRDLAVVLETAPILTAL 
nrPvnt?RV^AT AWNECGTHPODERLGLGW 
GLGEPGSEERLFPAAITSR 


262 


1612 


A 


2733 


3 


431 


GPEFPGS AKL VFLDLSYNNLTQLGAGA1' Kh>A 
GRLVKLSLANNNLVGVHEDAFETLESLQVLE 
t \m\TKn i>cr cvaat A AT PAl RSLRLDGNPWL 
CDCDFAHLFSWIQENASKLPKGLDEIQCSLPM 
ESRRISLRACRRPASRV ___ 
"^TDTcr- \/riPPVT3 v a TK^nFTslCSFEDNKNWOF 


263 


1613 


A 


2736 


2 


343 


LWGLNGNFNFFKEPWGGRNNHAKGFR1TW 

ARSSSQNrmTFQNNRNFLRLQRDSQKKGQFA 

RLISPLVNLPQSPGGLEFQYOAT 


264 


1614 


A 


2738 


2 


245 


P^lMLKCLREGQPPPSYNVVTRLDGPLPSGVRV 
DGDTLGFrPLl 1 briovjl Y vivri-U i incut o^ivj-^oaa 
DTVDVLDPPEDSGKQVDL 


265 


1615 


A 


2752 


2 


388 


AAGDAPLRSLEQANRTRFPFFSDVKGDHRLV 

• *i - ^ 1/T TCA t/QT i rsxrvfAf VT VARRR 
LAA VETT VL VLlr A VbLlAJiN v vly AVJVIX1V 

RRGATACLVLNLFCADLLFISAIPLVLAVRWT 
EAWLLGPVACHLLFYVMTLSGSVTILTLAAV 

SI RR 


266 


1616 


A 


2755 


192 


1 


AFREVGGYWGLLCEHL Y AlPSK'l SbGMWTAK 
LQGYLPLQDAFHIFQDPLTGDLPWTELILGLP 

V 


267 


1617 


A 


2760 


434 


714 


ASRLEKQNSTPESDYDNTPNDMEPDGMU Y M 
HRTSVPGEGLPKAKUl-AUlAJV^^vr nniri 
LYFQTHKGLKDSSLRSEVTCLGISQCWRKGFF 


268 


1618 


A 


2762 


1 


405 


'lACTTCGQDEWSPERSTRCFRRRSRFLAWUiir 

VQASGGPLACFGLVCLGLVCLSVLLFPGQPSP 
AJ^CLAQQPLSrlLPLTGCLSTLFLQAAEIFVESE 
1 PI SWAE 


269 


1619 


A 


2772 


3 


243 


TRPAEWQYLVLFFVMSHPSQAYDKLSLSUtiL 
LIAVLNLLRREVSEHGRHLQQYFmFVMYAN 

L S KNL S r b br u v o i 


270 


1620 


A 


2789 


1 


486 


ELQSQQACmTKETEQLRSQLQTLKQQHQgA 
VEQIAKAEETHSSXSQELQARLQTVTREKEEL 
LQLSIERGKVLQNKQAE1CQLEEKLEIANEDR 
KHALERFEQEAV AVD SNLRVRELQRKVDGIQ 
KAYDELRLQSEAFKKHSLDLLSKERELNGKL 

T> T TT CD 

RHLbr 


271 


1621 


A 


2795 


1


568


" J^KJ^VTVQLFTESlQK^QEDKEKMWRKQKb 
FSGSDRGKLPGSEEKNQGPSM1GRKEERLITE 
RKrErlLKNKSAPKVVKQKVIDAHLDSQTQN 
FQQTQIQTAESKAEHKKLPQPYNSLQEEKCLE 
VKGIQEKQVFSNTKDSKQEITQNKSFFSSVKE 
cncnnr.KT.AT TvnVEFLRKREELHQD^STVKQP 


272 


1622 


A 


2797 


8 


523 


" KCMQGKY AG AMESEPC VCTEADFDCD Y U Y b 
RHSNGQCLPAFWFKPSSLSKDCSLGQSYLNST 
G YRK W SNNCTEKj VREQ YT AKPQKCPGKAP 
r> p tvt A nGKl TAROGHNVTLN1VQLEEGD 
VQRTLIQVDFGDG1AVSYVNLSSMEDGIXHV 

YONXGIXRXTVQVDNSLGS 


273 


1623 


A 


2801 


72 


395 


" ' HPSRSNVGPRQLTVVVNTSNLSHDNRRKYIFS 
DEEGQNQLGIRIHQDIPLPPRRRELPALRTTNG 
KADSLNVSRNSVMQELSELEKQIQVIRQELQL 
A VSRKTELEE YH 


274 


1624 


A 


2805 


168 


320 


" iLWLYFETGTWVYPVFAKLSLLGLAALFSLRE 
IFIARNGWGETLTHCKRV 
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275 


1625 


A 


2812 


208 


321 


GSLATCQLSEPLLWFILRVLDTSDALKAh HU 
MGKIIFQ 


276 


1626 


A 


2813 


41 


266 


AORSLHGAGDRAWVGISPTDWSPKWELCK 
KYQQQTVVAIDLAGDETlPGbbLLr\jrt vi^a i 
QVGPVRRNGEAGPG 


277 


1627 


A 


2817 


3 


410 


VLQERLDNFQRKCIQLASSTTEGKVDKLLMJtN 
LFISYLHTPKHKQHEVLQAMOblLOl 1 uthMb 
PLFQEEHGTATRWMTGWLEGGSKSVPKTPL 
GLNQQPALNGSFSELFVKFLKTESLSSTLPTX 
LPFHNSPGKIK 


278 


1628 


A 


2821 


238 


457 


GLSGPSCSCPHSPLFniSKAgLHl ALKWkN Vb 
VKLRLLLHLEELQME1 1DIRHYDLES VPMTWD 
PVDQNPRLV 


279 


1629 


A 


2822 


342 


1 


PLIPANLPAHSNPLQPLPSLPHPFLPATHKFPT 
TPPTFSSVPPPLPSLSSILHHSPLHSELNPHLQS 
CRLPSRPSVSRELPPQSGPASSVPLAPTPLPDS 
VPSQRHPTXPPPAS 


280 


1630 


A


2825 


307 


77 


PSMVWSYHWGVKQKRLALCVFSFEEGGRRK 

CGQYWPLEKDSRIRFGFLTVTNLTGAVGEPG 

VAFQCDGQRRREFTC 


281


1631 


A 


2827 


81 


381 


" KMGTAVWVPKEKEKRDKASQEGGDVLGAK 
QDCTPSLKSLVATGNLLDLEETAKAPLSTVSA 
KTTOMDEVPRPQALSGSSVVWVSGCVASRS 
VILSLTSG 


282


1632 


A 


2830 


471 


160 


"lOPXDKYELEPSPLTQYILERKSPHTUWgvFV] 
TSSGKYNELGYPFGYLKASTTLTCVNLFVMP 
YNYPVLLPLLDDLFKVHKLKPNLKWRQAFDS 
YLKTLPPYYL 


283 


1633 


A 


2835 


462 


148 


VSPALSLTPTIFSYSPSPGLSPFTSSSCFSFNPiili 
MKHYLHSQACSVFNYHLSPRTFPRYPGLMVP 
PLQCQMHPEESTQFSIKLQPPPVGRKNRERVE 

SSEESAP 


284 


1634 


A 


2836 


2


384


KTLPRTLLDILADG'nLKVGVGCSEJJASJU^V? 
DYGLVVRGCLDLRYLAMRQRNNLLCNGLSL 
KSLAETVLNFPLDKSLLLRCSNWDAETLTED 
QVTYAARDAQ1 S VALFLHLLGYPFSKNSPGEK 

KR 


285 


1635 


A 


2843 


20 


271


PIRPYYSYSGLDRDCSWLPLAKAWLPDVMIL
VCDRVSEDGINRQQAQEWCIKHGFELVELSP
EELPEEDGKCLCVRRKYG1 YI _


286


1636


A


2845


197


278


TAEDVLTVAYEHGVNLFDTAEVYAAGK


287


1637


A


2851


2


427


' FVAEVRREWAKYMEVHEKASFTNSELHRAM 
NLHVGNLRLLSGPLDQVRAALPTPALSPKDK 
AVLQNLKRILAKVQEMRDQRVSLEQQLRELI 
QKDDITGSLV 1 1 DH^rvOKiU^r cca^injv i l/v 1 - 
KVYLEONLAAQDRVLCALT 


288 


1638 


A 


2859 


2 


469


" FVNLGILTCIECSGlHREMGAHISRJQSLhLUK 
LGTSELLPAKNVGNNSFKD1ME.ANLPSPSPKP 
TPSSDMTVRKEYITAKYVDHRFSRKTCSTSSA 
KLNELLEAIKSRDLLALJQV Y Abu v tL/vuir la* 
EPGQELAETALHLAVRTADQTSLHLVE 


289 


1639 


A 


2861 


2 


454 



"'"FVASGGPATARMSDSQFFCVAEERSGHCAVV 
DGNFLYVWGGYVSIEDNEVYLPNDEIWTYPI 
DSGLWRMHLMEGELPASMSGSCGACENGKL 
YIFGGYDDKGYSNRLYFVNLRTRDETY1WEK 
ITDFEGQPPTPRDKLSCWVYKDRLIYFG 


290 


1640 


A 


2868 


1 


378 


" ■frqgqlykvflhgsqgqvyhsqqvgppgsai 

SPDLLLDSSGSHLYVLTAHQVDRIPVAACPQF 
PDCASCLQAQDPLCGWCVLQGRCTRKGQCG 
SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


PAnm wnwi WSYEEDSHCLHIOSLLPGHHPR 
QE 


291 


1641 


A 


2870 


1 


385 


"cd v\yTP"MT>JT3 HOT 1 R KTCH1GNDIVT1 VFQbFU AJL 
PFTPKSrRSHFQHVFVIVKVHNPCTENVCYSV 
GVSRSKDVPPFGPPIPKGVTFPKSAVFRDFLL 
AKVINAENAAHKSEKFRAMATRTRQEYLKD 

LA 


292 


1642 


A 


2877 


3 


188 


dtytdddd a T^rr^QPP^N^TK^T^T^l KfCFKSAILDLYl 
PPPPAVPYSPRYVAVHCHGMLVSCWCHL 


293 


1643 


A 


2878 


1 


427 


GAGTHPDAA1PSGERTCGSEGSRSVLDLVNYF 

LSPEKLTAENRYYCESCASLQDAEKWELSQ 

GPCYULTLLRFSroLRTMRRRKILDDVSIPLL 

t r>i Tit A/VID/^A A VTlI 


294 


1644 


A 


2879 


109 


245 


QLCCFCFRQTTLIVYILSFlGMVifTFTLDLRY 1 

IIVFV 1 KjKj VlAJ 


295 


1645 


A 


2880 


3 


320 


LASSQHGILNNLSLLFSICKTCIRTMDHHCPKA 
NNCVGEQNHRFFCALHCKSKHFCIEFTLNTNF 
FNCFLPGAEKSTIDAPFSLQPFLQDSKYNTALS 

T SF.STSQ 

— KT-NT/oT tot iS'VTjf t rSTrrvxTPPTt r5T)99MP001*E 


296 


1646 


A 


2892 


209 


363 


SQYSriSLUYriLLv^V 1 JSJNrr l L/Vji-'ooi^rvjv^ 1 1- 
RLQEFSQKMDQVRGHWPVST 

— nrnfTT VI T~\T^T?TT T "T""* T rVKTTT \/T TT A I'PPFK/f A OO 


297 


1647 


A 


2893 


8 


424 


SPX I LXLD I r iLUjIQUNLL vlil»a i rrr ivirvvjvj 

KLYSTMGRFLRDRKNPACREMAWLLANLA 

QGDSLAARAIAVQKGSIGHLLGFLEDSLAAT 

QIQQSQASLLHMHhHWEPTSVDMMRRACRA 

LLALAKVDDNHSEF 


298 


1648 


A 


2894 


310 


445 


FWlYFPSFFMTGYLPLGFEFAVEITYPEStCj-i 3 
SGLLNAaAQVNL 


299 


1649 


A 


2898 


1 


492 


" KIKAKNLTNYDLCSIFLGTSTLLVWVCiViKYL 
GYFQAYNVLILTMQASLPKVLRFCACAGMIY 
LGYTFCGWIVLGPYHDKFENLNTVAECLFSL 
VNGDDMFATFAQIQQKSUVWLFSRLYLYSFI 
ot rivxin or ptat TTn^vnTTKKFOONGFPETD 

LQEF 


300 


1650 


A 


2901 


1 


445 


" PVWWNSLNGASEVTFSVHVKDGGSFFK1UST 
t\ rr\rDt7\rkJV A riPPTcT VP A KFOTFMFPENOPVS 
SLVTTITGSSLRGEPMSYYTASGNLGNTFQIDQ 
LTGQVSISQPLDFEKIQKYWWIEARDGGVPP 
FSSYEKLDITVLDVNDNAPIF 


301 


1651 


A 


2902 


162 


433 


THFICLPLGYCFPLLDKDLQLPSGFNCNFDFLE 
cDpr.wMYriHAKWl RTTWASSSSPNDRTFPG 
KP A V S EDMKELRP A C STYFNP RFP YKL 


302 


1652 


A 


2909 


2 


412 


"" npnx>n ri<rkfTVFTWVTRSOCOFEWLADIMQEV 
EENDHQDLVSVfflYVTQLAEKTOLRTTMLYI 
CERHFQKVLKR^IJT'GUlSrTHFGKPPFEPFFN 
SLQEVHPQVRKIGVFSCGPPGMTKNVEKACQ 
LVNRQDRAHFM 


303 


1653 


A 


2914 


291 


453 


" jaNR\^CrTYSWSFGILLYEMVTLGAPPYKt 
vppt^tt FHl ORRKIMKRPSSCS 


304 


1654 


A 


2926 


179 


354 


PGVPSQALRKAESLKKCLSVMEAKVKAQTAP 
NKDVQREIADLGEVGAASLPPSSGPGA 


305


1655


A


2938 


135 


438 



" GMGYLHAKGILHKDLKSKNVFYDNGKVVri- 
DFGLFSISGVLQAGRREDKLRIQNGWLCHLA 
PEIIRQLSPDTEEDKLPFSKHSDVFALGTIWYE 

LHAREWP 


306 


1656 


A 


2944 


2 


329


" VRWNSCVNCSCAFGNGASLSTSLGESSCxCLW 
EIGK\V^CSLLSFPSPLAVLnTFCIVTVLGREA 
LTKGALWAVFLLAGSALLCAEVTGVIWRQPE 
SKTKLSFKVSSSA 


307 


1657 


A 


2950 


2 


411 


N YLCIAKN S AGS AMGKTRLVVQVPP VIENGL 
PDLSTTEG SHAFLPCKARGSPEPNIT WDKDGQ 
PVSGAEGKFTIQPSGELLVKNLEGQDAGTYT 
rTAENAVGRARRRVHLTILVLPVFTTLPGDRS 
LRLGDRLWLR 


308 


1658 


A 


2951 


1 


407 


PTE PPR VRFDNEFDAESORKR'ITS VSKMERM 

DSSLPEEEEDEDKEAINGSGNAENRERHSESS 

DWMKTVPSYNQTNSSMDFRNYMMRDETLEP 

LPKNWEMAYTDTGM1YFIDHNTKTTTWLDP 

RLCKKAKAPEDC 


309 


1659 


A 


2954 


2 


179 


" nnri rr tt TFPTOI T YVGAREALFAFSMEALE 
LQGAVRGGAVGGSRACQRARPRGAVLG 


310 


1660 


A 


2959 


1 


419 


QDMMERAIIDTFVGHDWEPGSYVQMFPYPC 
YTRDDFLFV1EHMMPLCMVISWVYSVAMTIQ 
HIV AEKEHRLKEVMKTMGLNNA VH WV AWFI 
TGFVQLSISVTALTAILKYGQVLMHSHW1IW 
LFLAV i A V A i Jivir^r 


311 


1661 


A 


2963 


3 


465 


MKPQMPGLGAPNGYGPGRGRAGVPGGPbKK 

PWVPHLLPFSSPGYLGVMKAQKPGAGEGMK 

PQKPGLRGTLKPQK.SGHGHENGPWPGPCNA 

RVAPMLLPRLPTPGVPSDKEGGWGLKSQPPS 

a vrkxrnvi pr;unpp>JOYOPGAEPGFNGGLEPQ 

KI 


312 


1662 


A 


2967 


3 


405 


WLAQEWSPCTVTCGQGLRYRWLClDHXtjM 
HTGGCSPKTKPHIKEECIVPTPCYKPKEKLPV 
EAKLPWFKQAQELEEGAAVSEEPSFIPEAWS 
a ^nrrrrvHTOVP TVP TO VI ,1 SFSOSVADLPI 
DECEGPKPA 


313 


1663 


A 


2969 


2 


430 


WADNCRQGYLDALRFLERRGLTKEPVLW 1 
LVSKEPPAPADGNWDAGCDQRRKGGLSLNW 
KVPHVQVKDVPNFEQLSPELEAALKKACTRD 
dcdu/ar FWH<;riPfiOVLTYLLLPCTLPFE YTYF 
RSRRLWWLPDVPADLWWMQ 


314 


1664 


A 


2971 


422 


33 


LDXSI^ALQRLRTOWLAPLFQLRALHLDrlNJb 
LDALGRGVFVNASGLRLLDLSSNTLRALGRH 
DLDGLGALEKLLLFNNRLVHLDEHAFHGLRA 
LSHLYLGCNELASFSFDHLHGLSATHLLTLDL 


315 


1665 


A 


2973 


1 


525 


ITVSTHASGSPFGLEPQSGWLWVRAALDKfcA 

QELYILKVMAVSGSKAELGQQTGTATVRVSI 

LNQNEHSPRLSEDPTFLAVAENQPPGTSVGRV 

FATDRDSGPNGRLTYSLQQLSEDSKAFRIHPQ 

TGFVTTLOTLDREOQSSYQLLVQVQDGGSPP 

RSTTGTVHVAVLDLNDNT 


316 


1666 


A 


2978 


2 


400 


EL VVELV S AGKSGPERN 1 VEVQWTGNVPKA 
GTDANVYLTIYGEEYGDTGERPLKKSDKSNK 
FEQGQTDTFTIYAIDLGALTKJRIlvHDNTGNR 
A fi WFLDRIDITDMNNE1TYYFPCQR WL A VEE 
DDGQLSRE 


317 


1667 


A 


2981 


3 


440 


VLNCQGRPTRPVRINGDGQEVLYLAESDNVK 

LGCPYVLDPDDYGPNGLDIEWMQVNSNPAH 

HRENVFLSYQDKRINHGSLPHLQHRVRFAAS 

DPSQYDASFNTM^QVSDTATYECRVKKTTM 

ATRKVIVTVQARPAVPMCWTEGQ 


318 
319 


1668 
1669 


A 
A 


2995 
2999 


119 

2 


414 

332 


LPEKEFPimKSSSLKVTKCLFTEQPKPlilLK^A 
ENYDARLLRIDIANTLREQVQELFNKTYGKQ 
RRTPGEGHVAAVDREVAGFPVPAEGISGETIH 
GFFAYTYGRLWVEDLHSGAQQHWSUHSAbi 
o-ri atc'uqahvi A^ASGRSSTTAHCOIRVWD
I .LAJLbrio AV^ V LsJ\Sf**J vjjvjo i i ru ■ IV -^V A - ,X 

VSGGLCQHUFPHSTTVLALAFSPDDRLLVTL 
GDHDGRTLALWGTGHL 


320 


1670 


A 


3000 


693 


322 


IDESTGL1ITVNYLD YETKTS YMMNVS Al DQA 
T^rttrxir^trr'^VVTn T "NTFT DFAVOFSNASYEAA 
ILENLALGTEIVRVQAYSIDNLNQITYRFDAY 
TSTOAKALFKIDAITVRGWGOGAPFFPI 

. pvT\n rpoTr.FT FfrFA^^RT PPOPC 


321 


1671 


A 


3001 


6 


383 


RIPRGKACXTV LOKo 1 UfcLcur Aaor\-L»rr v r ^ 
GWGQSSDLLSRIDLDELMKKDEPPLDFPDTLE 
GFEYAFNEKGQLRH1KTGEPFVFNYREHLHR 
WNOKRYr.ALCjbll In. x v i clxcisj^^i>*oivjv * ^ 


322 


1672 


A


3007


192


447


ERVRNSLKPGRGDSQCACCPSSPVWVFLKTm- 
LFPWLFLQVEVIKKAYMQGEVEFEDGENGK 
DG AASPRNV GHN 1 Y \XJ\nUL, Aiui 




323


1673


A


3019 


18 


245 


KELLF YHLIVNNINFFNTRY AKIHIPIIAS V 5btl 

^r^unrpccrnT tttt \/rTPPAfiT \irFprKTsnND 
QFTTWVSFFFDLHlLVClrrAOi- wrv^iivi^ii^^ 

ERVFGKRGF 

— - ■■ ,^.^» x r.tiTTonuirr'n vrcTl C A ITJ li K FPrJl rl 


324 


1674 


A 


3020 


523 


797 


LCYFSARYHQRKJxCjLL Y Ir l Lb/urNisJSXiriNi>ri 
YLFIFFEMESHSVTHAGVQRHNLNSLQPLPPG 
FKRFSCLCFLSSWNYRGAPPGPANF 


325 


1675 


A 


3022 


2 


156 


NDFLPLYFGWVLTKKSSbTLRKAGQVFLbbX 
GNHKAFKKELRQCRWQVGAL 


326 


1676 


A 


3023 


38 


172 


KMVRGSKKLISFFFOUr Y uiLAUKUrirvUi^/v i 
FCLNKEALKDEFE 


327 


1677 


A 


3027 


1 


385 


LTLEFLLLPAASELAHGKRLACCIVDHKLPtvJ 
GFYGLYDKULLFKKDPTSANLLQLVRSSGD1Q 
EGDLVEVVLSASATFBDFQLRPHALTVHSYRA 
PAFCDHCGEMLFGLVRQGLKCDGCGLNYHK 


328 


1678 


A 


3030 


13 


569 


ITRPTISCQRPGPGLAAUMLFY i viNris.vorvivi 
LTGALNAHNKAAVDWGWQGLIAYGCHSLV 

vvidsitaqtlqvlexhkadvvkvkwaren 
yhhnigspyclrlasadvngkiivwdvaagv 
a(x:eiqehakpiqdvqwlwnqdasri)lllai 
hppny1vlwnadtgtklwkksyadnilsfsf 

T> 


329


1679 


A 


3038 


90 


744 


SVNLPPSLWPWEEAMDSTKSEPLKCiSPbAiiU 
GNIEYKKL VNPSQYKrbHLV ii^jvijvvvm- > v e ' vj 

rgeavyq1gvedngllvglaeeemraslktl 
hrmaekvgaditvlrerevdydsdmprkite 
vlvrkvpdnqqfldlrvavlgnvdsgkstl 
lgvltqgeldngrgrarjlnlfrhlheiqsgr 

-rnncrcTi r irviQk^ op vwrTrMnTO WGOTLRMG 

w 


330 


1680 


A


3040 


3 


397 


' LCSTLLLLTIPS\m.SQITLKESGPTLMKKlbl 

ltxtctfsgfslntsgvgvawirqppgkale 

wlaliywdddkryspslndrltiakdtsrkq 

wltmtnmgpvdtatyycaqfargargsn 


331 


1681 


A 


3043 


3 


1509 


~ AGIRHEAPPTTSNRHRRQIDRGVTHLNIbULK 
MPRG1AIDWVAGNVYWTDSGRDV1EYAQMK 
GENRKTLISGMTOEPHAIVVDPLRGTMYWSD 
WGNHPKIETAAMDGTLRETLVQDNIQWPTG 
LAVDYHNERLYWADAKLSVIGSIRLNGTDPI 
V AADSKRGLSHPFSIDVFEDY1YGVTYINNRV 
FWHKFGHSPLVNLTGGLSHASDVVLYHQHK 
QPEVTNPCDRKKCEWLCLLSPSGPVCTCPNG 
KJILDNGTCVPVPSPTPPPDAPRPGTCNLQCFN 
GGSCFLNARRQPKCRCQPRYTGDKCELDQC 
-\\rcwr*x»jnnTC A A qpqriMPTCRCPTGFTGPKC
TQQ VC AG YC ANN STCTVNQGNQPQCRCLPG 

FLGDRCQYRQCSGYCENFGTCQMAADGSRQ 

CRCTAYFEGSRCEVNKCSRCLEGACWNKQS 

GDVTCNCTDGRVAPSCLTCVGHCSNGGSCT 

MNSKMMPECQCPPHMTGPRCEEHVFSQQQP 

ftUTASTT .TP 


332 


1682 


A 


3045 


3 


952 


TTT1SNFHTQVNRTYCCGTYRAGPMRQISLVG 
AVDEEVGDYFPEFLDMLEESPFLKMTLPWGT 
LSSLRLQCRSQSDDGPIMWVRPGEQMIPTAD 
MPKSPFKRRRSMNEIKNLQYLPRTSEPREVLF 
EDRTRAHADH VGQGr D W v^b 1 AA v <j v v 
QFGE W SDQPR1TKD V1CFHAEDFTD WQRLQ 
LDLHEPP V SQC VQ WVDE AKLNQMRREG IRY 
ARIQLCDNDIYFIPRNVIHQFKTVSAVCSLAW 
H1RI-KQYHPVVEA I 1 iioNoiNmLf^\ji-. i vjr%j\ 
ELEVDSQCVRIKTESEEACTE1QLLTTASSSFP 

PASE 

— 7T\ ^rMy/>nrT -non aTDCI TP P AXIO^r^PnTrDRT , 


333 


1683 


A 


3046 


497 


167 


SACSTGPELPGKA 1 K£>L 1 ktaj> ki^kj^uwiu^ 
YYDGCAMIAMNGSVFAQGSQFSLDDVEVLT 
ATLDLEDVRSYRAEISSRNLAVSAPVDTCVG 
CSSKTWKVAPFVRAWWRP 


334 


1684 


A 


3053 


37 


276 


VqTDLEEQLNQLTEDNAKLNNQNFYLSKQLJJ 
EASGANDEIVQLRSEVDHLRREITEREMQLTS 
OKOVRRVNKVVRSLEDF 


335 


1685 


A 


3054 


2 


846 


WDAWGDWSDCSRTCGGGASYSLRRCL'I UR 

NCEGQNIRYKTCSNHDCPPDAEDFRAQQCSA 

YNDVQYQGHYYEWLPRYNDPAAPCALKCH 

AQGQNLWELAPKVLUO 1 KCin i uoLumuau 

ICQAVGCDRQLGSNAKEDNCGVCAGDGSTC 

RLVRGQSKSHVSPEKREENVIAVPLGSRSVRI 

TVKGPAFILFrESKTLQGSKGEHSFNSPGVFW 

ENTTVEFQRCj obKl^ i r ivirurL < ivi^j_/r iriviiM 
TAAKDSWQFFFYQPISHQWRQTDFFPCTVT 

CGGG 


336 


1686 


A 


3058 


54 


347 


" VVGKQEAGAHSDSCCLLHTPPRLTPAHSKKA 
LRNSRIVSQKDDVHVCIMCLRAIMNYQVSRG 
AWDWRLGSPACPHWGLHKLPRLWDPLSLYP 

VT CWGT 


337 


1687 


A 


3059 


2 


709 


ILTSLVELTRFETLTPRFSATVPPCWVEVWb 

QQQRRHPQHLHQv^rUHUUAAvri 1 K 1 w IVL -' V < 1 
DSNSWDEHVFELVLPKACMVGHVDFKFVLN 

SNTTNIPQIQVTLLKNKAPGLGKVNGLRLCPF 
LEDHKED1LCGPVWLASGLDLSGHAGMLTLT 
ot>v*t \rv r^\A A nnv YR 'sFT 1HVKAVNERGTEEI 
CNGGMRPWRLPSLKHQSNKGYSLASLLAK 
v a a r,v VK Q WVKNENTSGTRK 


338 


1688 


A 


3060 


85 


384 


" KAFYNYHVLELLQMLVTGGVSSQLEQHLDK 
DKVYGVADSCTSLLSGRNRCKLGLLSLHETIL 
SDVNPRNTFGQLFCGSLDLFG1LCVGLYRIIDE 
FFT*TP 


339 


1689 


A 


3063 


236 


362 


CFLCLSGDFMVMTIFFN VSRRFGY Y AFQN Y V 
PSSVTTMLSWV 


340 


1690 


A 


3065 


3 


1249 


" DLWQFTPLHEAASKNRVEVCSLLLSYGAJJl'l 
LLNCHNKS AIDL APTPQLKERL A Y EFKGHSLL 
QAAREADVTRIKXHLSLEhfVNFKHPQTHETA 
LHCAAASPYPKRKQICELLLRKGANINEKTKE 
FLTPLHV A SEKAHND VVE VVVKHE AK VN AL 
DNLGQTSLHRAAYCGHLQTCRLLLSYGCDPN 
IISLQGFTALQMGNENVQQLLQEGlSLCiN SEA
DRQLLE AAKAGDVETVKKLC 1 a vrs tRUiL 
GRQSTPLHFAAGYNRVSWEYLLQHGADVH 
AKDKGGLVPLHNACSYGHYEVAELLVKHGA 
VWVADLWKFTPLHEAAAKGKYEICKLLLQ 
HGADPTKKNRDGNTPLDLVKDGDTD1QDLLR 
GD AALLD AAKKGCL AR VKKLS SPDNVNCRD 
TQGRHSTPLHLAGK 


341 


1691 


A 


3070 


1 


547 


G VLIP SFQNQLF ADILAGIES VTSEHNYQ1 L I A 

NYNYDRDSEEESVINLLSYNIDGIILSEKYHTI 

RTVKFLRSATIPWELMDVQGERLDMEVGFD 

NRQAAFDMVCTMLEKRVRHKILYLGSKDDT 

RDEQRYQGYOTAMMLHNLSPLRMNPRAISSI 

HLRMOLMRDALSANPDLDGVFCTN 


342 


1692 


A 


3073 


463 


3 


RINRCRKPSDADILVPGDTISLIGTTSLRIDYNE 
IDDNRVTAEE\T)ILLREGEKXAPVMAKTR1LK 
AYSGVRPLVASDDDPSGRNVSRGIVLLDHAE 
RDGLDGFH1TGGKLMTYRLMAEWATDAVL 
RKLGNTRPCTT ADL ALPG SQEPAK VP 


343 


1693 


A 


3075 


250 


1 


LLIYLAIFAPVAMSALAGVKSVQQVRIRAAQS 
LGASRAQVLWFVILPGALPEILTGLRIGLGVG 
WSTLVAAELIAATRGLGFM 


344 


1694 


A 


3076 


2 


138 


LYFDAYTQSLQVAAISTFCCLLIGYPLAWAV 
AHSKPSTRNILLLL 


345 


1695 


A 


3078 


469 


3 


LKIRG QRIELGEIDR VMQ ALPD VEQ A VTHAC 

VINQAAATGGDARQLVGYLVSQSGLPLDTSA 

LQAQLRETLPPHMVPWLLQLPQLPL1ANGKL 

DRJCALPLPELKAQAPGRAPKAGSETI1AAAFS 

SLLGCDVQDADADFFALGGHSLLAMKLAT 


346 


1696 


A 


3082 


404 


2 


QN1TSKDLDVRLDPQTVPIELEQLVLSFNHM1 
ERIEDVFTRQSNFSADIAHEIRTPITNLITQTEI 
ALSQSRSQKELEDVLYSNLbEL 1 KMAKJVi v 
MLFLAQADNNQL1PEKKMLNLAHEVGKVFD 

QFEALPE 


347 


1697 


A 


3084 


3 


340 


NELTFKEAEISKLYTKVHPAYRTLLEKRQALb 
DEKAKLNGRVTAMPKTQQEIYRLTRDVESGQ 
QVYMQLLNKEQELKJTEASTVGDVRJVDPAIT 
OPGVLKPKKGLIILGA1 


348 


1698 


A 


3086 


723 


10 


TQAM VWQQKAC AEDDPQLSGRH WLHAA 1 L 
YNIAA YPHLKGDDL AEQ AQAL SNRA YEE AA 
QRLPGTMRQMEFTVPGGAPITOFLHMPKGDG 
PFPTVLMCGGLDAMQTDYYSLYERYFAPRGI 
AMLTIDMPS VGFSSKWKL 1 <^DbbL.LHi^l-i v ljv 
ALPNW WVDHTRVAAFGFRFGANV AVRLAY 
LESPRLKAVACLGPWHTLLSGLKCQQQVPE 
MYLD VL ASRLGMHD ASTKS STRENH 


349 


1699 


A 


3087 


2 


249 


RTRSSDPEjTLAGTPLHAAYLIGMTLICAGFSV 
GFGVAMSQALCjrr bLKAuV Aoo i luia^vlu 

sslwiwlaawgigawnm 


350 


1700 


A 


3099 


3 


424 


eapeatpqpsqpgpsspislsaeeenaegevsr 

ANTPDSD1TEKTEDSSVPETPDNERKASISYFK 
NQRGIQYTDLSSDSEDWSPNCSNTVQEKTFN 

kdtviivsepsedeesqglptt^arrnddisele 
dlsgmedlk 


351 


1701 


A 


3108 


2 


404 


" jkknhiigyqllhrralfekrtrlsdyalifg 
mfgiwmvietelswgayykaplyslalkcl 

ISLFTHLLGLTIVYli^RElQLFMANYGADDWR 
SALTYEPIFLILLEAXRGV1HATPCRVSLSLWD 
GLDLP 


352


1702 


A 


3110 


341 


2 


AQLAEVCPPQTLLTTNTSS1S1TAIAAE11CNPEK 
v a ni WFFMPAPVMKLVEVVSGLATAAEVVE 
QLCELTLSWGKQPVRCHSTPGHVNRVARPY 
Y SE A WRALEEQ V AAPEVI 


353 


1703 


A 


3111 


3 


188 


HFSLFR1AFAVFLTYMTVGLPLPVIPLFVHHHL 
GYGNTMVG1AVGIQFLATVLTRGYAGRLA 


354 


1704 


A 


3116 


367 


225 


WQLFHLNGTFLNIGETDTESCVNGWVYDRSb 
FPFSNMTEVRGLVFLS 


355 


1705 


A 


3117 


101 


53 


VIHL VYLI SSPRPELKP VDKESEVVMKFPDCiF 
EKJFSPPlLQLDbVUr Y YUrKrivirois^o v^/vi^i- 
ESRICVVGENGAGKSTMLKLLLGDLVAPVRGI 
RHAHRNLKJGYFSQHHVGAAGT*TFSACGNL 
LGTQVFLGRPEEEYXRHQLGFGMGISGELGHA 

S SLP ACLO GQIvkAfc Y Ar i^oiajljut v^rj^rni^i 
DEFTN\HLGHGRA1EALGPCLQTISGVGVILVS 
HE* SALSRLVCRE\LWVC*GRSTSPF 


356 


1706 


A 


3121 


137 


466 


RGGRDWGEHNQRLEEHQARAWQGAMDAO 
AASREHARWQGTGLAPGTRVAVAPTCVQGL 
PQERSVCRPFFSSRWREGPVWALGAGAHGKP 
RWSGGVRCWRGGRWFTPAPH 


357 


1707 


A 


3124 


1249 


229 


MLE APGPSDGCELSNPS ASRVSC AGQMLt v g 

PGLYFGGAAAVAEPDHLREAG1TAVLTVDSE 

EPSFKAGPGVEDLWRLFVPALDKPETDLLSH 

LDRCVAF1GQARAEGRAVLVHCHAGVSRSV 

Al ITAFLMKTDQLPFEKAYEKLQILiu*n/USJViJN 

EGFEWQLKLYQAMGYEVDTSSAIYKQYRLQ 

KVTEKYPELQNLPQELFAVDPTTVSQGLKDE 

VLYK.CRKCRRSLFRSSSILDHREGSGPIAFAH 

KRMTPSSMLTTGRQAQCTSYFIEPVQWMESA 

LLGVMDGQLLCPKCSAKLGSFNWYGEQCSC 

GR WITP AFQIHKNRVDEMKILP VLG SQTGKI 


358 


1708


A 


3127 


816 


139 


" EVETLGPRTPGP/EAQSPTPGSCPG WQEPSFUi- 
TPPP*LSGPGPQGAPVLUlU^LrlJrJbJsi r aojw r 
LGIGiFWWGL\PVTSANFSPGAAA*FGGALSPP 
GGDL/GHMLLQGPPSPFRLQQQ* QTPPGSHSP 

TV- A vmPTvroPD AAA ATYTT3 WOHKR S WRGW 

PTANREENPGr AAAALJ l KoL, w unrvivo yv ivvj 

RGLAPWRLGFGSPGIP*PAPAGIP/GRPTWEGG 

KGAGGKPSETLTRSPPVWRGICRGSANGFLSW 

VQTT Q 


359 


1709 


A 


3132 


3 


191 


HHHLLLLL.LL- Vr.L v rvov^vj v inl/i^x^i^vji a 
HRPLDKKREDAPNLRPALADUTVCDYRAQIA 
* AASTPKRAASIAFINAVSCR* AQIA 


360 


1710 


A 


3134 


1 


286 


" REPPRPALLFF^DRVSLCCPGWNAWQSQLl 
AAPTSQVQ/SDSPTFPSSWDYRJiVTEYPANFL 
*RQGFPMLPRLVSNSWAQTVHPPRPPKVLDL 

OA 


361 


1711 


A 


3135 


56 


1449 


" PVPAPRVSPSARGAPGRPRLPGVRGPRHS/WA 
AD^RGSRM/PPRAPAPSPTGP/APGGKKVRGR 
VPEDPDAYEPRCSAL*V*PTHVTSPQFCDP*K 
GQmSYTTVXLRGLNETMLVK7PLCRREP/PEA 
GPGRQSTPAVTRDHRQHEDPRGAGRQWDAD 
PRPSAP/PAEVATGSRPGRHMVvWLCLAAQQ 
APGLPHRTS IRPG WRRLTEPE A W ARRHRRP W 
GQRGAVRPPPQGAAPPPSHQGRRTNTDPSAT 
PRLTVMSRCLAPDLKAPASGPRGWRRGMPQ 
SS/GALLWTPPPTPRGSRSPRPREAPLRAIHPA 
GPSK/SRAGASGRLPEVIYGWVTLFTPPEAGT 
F/LIPSPT*MSPALVIQPPVPPTQMGLRISGLPR 
QG*PSG AP W* LPGL AQLAFQCHLPFTDEVGPP 
RNQSPLGND'I*LSSGLPMGPRRQ VWPLAK V U
GHSSPREPQVLKKPLWGQTDIAGVGSASLYP 
rasn. 


362 


1712 


A 


3136 


1270 


274 


RVGMVLGTREVGDSTPPPSPPLYPFTGNth vg 

HNTWQLSRVYPSDLRTDSSNYNPQELWNAG 

CQM/V*GGSRDWEEGVEEQQVGNKFSSDGR 

VGECSRKLLG*EMLSVDITSRYRAPSTYLLNS 

t ircm phi wrrFSrsSFLLGPSVAMNMQTAGL 

EMDICDGHFRQNGGCGYVLKPDFLRDtQSSF 

WPF^PTSPFKAOTLLNOVISVOQLPKVDKTKE 

GSIVDPLVKVQIFGVRLDTARQETNYVENNG 

FNPYWGQTLCFRVLGPDFPMLRFGKMDYDW 

KSRNDLLGKTPCPGTCMQQGYRHIHLLSKDG 

tot ppa^TFVYTCIOEGLEGDES 


363 


1713 


C 


3139 


60 


248 


MFAGSYGKSMFSFSKKVLNCLPKWRYH^VIA 
PAMNESPLAPHLHQHLVFSVFQVLTILIGV** 


364 


1714 


A 


3140 


57 


418 


" SAFKTLQLPAFSLYFDLGSLKLLILRIHI^IVK 
xtu v \/p «3pp tm <;po* DPOSFLOIPOPRPPOLRV 
GLTSGLIQHFHSPSSCQFPLLRGPPFPRQPPLGI 
SGASLCPVLSPPR*PLQPSSL 
i t dvpqi pvft POrHFVTVRLECNGWSAHCN 


365 


1715 


A 


3145 


122 


413 


LHLPGSSDSPASAS*VAGTTGVCHHTRLIF\VF 
L V*TGFHY V AQAGLELLTA* SNPPQLPK WGL 


366 


1716 


A 


3150 


247 


2 


' VGEKLHDIRFGNDFDIvriTKAQATKEKlDKLN 
FIK1KKLC1EGYY/KREPQNGRKJFANYVS\DK 
GLMATIYEELLKLSNKLIQ 


367 


1717 


A 


3152 


3 


2367 


~ QKL KQN QPKRAH V EDGG SRSKQGNECj SKK i 
PIEKSDFAAATHPRAFYLSKPDETPNAWMSD 
SGTGLTYWKLEEKDMHHSLPETLEKTFISLSS 
TDVSPNQVLTLDPTLHMKPKQQISG1QPHGLP 
NALDDR1SFSPDSVLEPSMSSPSDIDSFSQASN 
VTSQLPGFPKYPSHTKASPVDSWKNQTFQNE 
SRTS STFPS VYTITSNDI S VNT VDEENTVM V AS 
ASVSQSQLPGTANSVPECISLTSLEDPVTLSKIR 
QKLKEKH ARHI ADLRA YYESEINSLKQKLE A 
KEISGVEDWK1TNQILVDRCGQLDSALHEATS 
RWTLENKNNLLEIEVNDLRERFSAASSASKI 
LQERIEEMRTSSKEKDNTIIRLKSRLQDLEEAF 
ENAYKLSDDKEAQLKQENKMFQDLLGEYES 
LGK£HRRVKI)ALNTTENKLLDAYTQISDLKR 
MISrU,EAQVKQVEHENMLSLRHNSRIHVRPS 
RANTLATSDVSRRKWLIPGAEYSIFTGQPLDT 
QDSNVDNQLEETCSLGHRSPLEKDSSP/GSSST 
SLLIKKQRETSDTPIMRALKELDEGKIFKNWG 
TOTFKEDTSNSLL*/INPRQTETSVNASRSPEK 
CAQQRQKRLNSASQRSSSLPPSNRKSSTPTKR 
EIMLTPVTVAYSPKRSPKENLSPGFSHLLSKN 
ESSPIREKTYSEKATDNHVNHSSCPEPVFNGV 
KKVSWTAWEKNKSVSYEQCKPVSVTPQGN 
DFEYTAKIRTLAETERFFDELTKEKDQIEAAL 
SRMPSPGGRITLQTRLNQVKCLSLNLL 


368


1718


A


3163 


2 


2350 


EFKSGGCGAGL VAAGA VLVLYPASRAUtK I 

RVPGSPAPSSLPLHSPGACGTEVDMDPQRbPL 

LEVKGNIELKRPLIKAPSQLPLSGSRLKRRPDQ 

MEDGLEPEKKRTRGLGATTKJTTSHPRVPSLT 

TVPQTQGQTTAQKVSKXTGPRCSTAIATGLK 

NQKPVPAVTVQKSGTSGVPPMAGGKKPSKRP 

A\\DLKGQLCDLNAELKRCRERTQTLDQENQ 
QLQDQLRDAQQQVKALGTERTTLEUHLAKV

QAQAEQGQQELKNLRACVLELEERLSTQEGL 

VQELQKKQVELQEERRGLMSQLEEKERRLQT 

SEAALSSSQAEVASLRQETVAQAALLTEREER 

LHGLEMERRRLHNQLQELKGNIRVFCRVRPV 

LPGEPTPPPGLLLFPSGPGGPSDPPTRLSLSRSD 

ERRGTLSG APAPPTRHDFSFDR VFPPG S GQDE 

VFEEIAMLVQSALDGYPVCIFAYGQTGSGKTF 

TMEGGPGGDPQLEGLIPRALRHLFSVAQELSG 

QGWTYSFVASYVEIYNETVRDLLATGTRKGQ 

GGECEIRRAGPGSEELTVTNARYVPVSCEKEV 

DALLHLARQNRAVARTAQNERSSRSHSVFQL 

QISGEHSSRGLQCGAPLSLVDLAGSERLDPGL 

ALGPGERERLRETQAINSSLSTLGLVIMALSN 

KESHWYRNSKLTYLLQNSLGGSAKMLMFV 

Nl SPLEENVSESLNSLRF ASKVEPS VLFGT AQS 

NRKWKTDPDLCVCVCVCVCVCVCVCVCVP 

MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


369


1719


A 


3165 


365 


12 


GYTSQGRWIDIERGPLTANTESLHENNFNALP 
GYTRJKJE*l*rYKKN*rWGGVGLLNIVKISILS/K 
IYRFDAIPVKILTRFFINLDKLILKFVLKTKIAK 
NRJKTFYTMRRKKLGDSS 


370 


1720 


A


3170 


393 


42 


GASISPSAVIDGVEGLKPMQEQEAQEAGPCLD 
*HMAPEQWVAPR\RLLFRLIFSVLHALnAAAA 
QSSAEEDEDPRN*GQSSEDQAPNQNGLIVIVH 
RVHVPLGAAATVPVHRSHFPR 


371 


1721 


A 


3173 


770 


510 


GNGGCGLSQIPPSHLGAFSRGSLLSRGXDPRGP 

PPHPVIFFVFWE\QGFTVLARMVS1S*PCDPP 

ALASOSAGITGVSHLARPQNLYF 


372 


1722 


A 


3180 


381 


76 


RVLHHDNVPAHSSPQKREISQEFQLEIRHLP^S 
PDLAPSGCFLFLNLKNIFK\GTHFSLVDNYKK 
TVSTWLH/SQNAQFYKDRLNGWYHCLQKCL 
QHY*AYVEK 


373 


1723 


A 


3181 


410 


14101 


RREVAGPEGKGLLLASAHTMLTPPLLLLLPLL 

SALVAAAIDAPKTCSPKQFACRDQITCISKGW 

RCDGERDCPDGSDEAPEICPQSKAQRCQPNE 

HNCLGTELCVPMSRLCNGVQDCMDGSDEGP 

HCRELQGNCSRLGCQHHCVPTLDGPTCYCNS 

SFQLQADGKTCKDFDECSVYGTCSQLCTNTD 

GSFICGCVEGYLLQPDNRSCKAKNEPVDRPP 

VLLIANSQNILATYLSGAQVSTITPTSTRQTTA 

MDFSYANETVCWVHVGDSAAQTQLKCARM 

PGLKGFVDEHTIN1SLSLHHVEQMAIDWLTGN 

FYFVDDIDDR1FVCNRNGDTCVTLLDLELYNP 

KG1ALDPAMGKVFFTDYGQIPKVERCDMDG 

QNRTKLVDSKIVFPHGITLDLVSRLVYWADA 

YLDYIEWDYEGKGRQTIIQGILIEHLYGLTVF 

ENYLYATNSDNANAQQKTSVIRVNRFNSTEY 

QWTRVDKGGALHIYHQRRQPRVRSHACEN 

DQYGKPGGCSDICLLANSHKARTCRCRSGFS 

LGSDGKSCKKPEHELFLVYGKGRPGIIRGMD 

MGAKVPDEHMIPIENLMNPRALDFrlAETGFI 

YF ADTTS YLIGRQKJDGTERETILKDGIHNVE 

G V A VD WMGDNL YWTDDGPKKT1 S V ARLEK 

AAQTRKTLIEGKMTHPRAIVYDPLNGWMYW 

TDWEEDPKDSRRGRLERAWMDGSHRDIFVT 

SKTVLWPNGLSLDPAGRLYWVDAFYDRIETI 

LLNGTDRKIVYEGPELNHAFGLCHHGNYLFW 

TE YRS GS VYRLERG VGG APPTVTLLRSE\RPPI 
FEIR\MYDAQHQQVGSNKCRVNNAGCSSLCL 

ATPGSRQCACAEDQVLDADGVTCLANPSYVP 

PPQCQPGEFACANSRCIQERWKCDGDNDCLD 

N SDE AP ALCHQHTCP SDRFKCENNRCIPNRW 

LCDGDNDCGNSEDESNATCSARTCPPNQFSC 

ASGRCIPISWTCDLDDDCGDRSDESASCAYPT 

CFPLTQFTCNNGRCININWRCDNDNDCGDNS 

DEAGCSHSCSSTQFKCNSGRCIPEHWTCDGD 

NDCGDYSDETHANCTNQATRPPGGCHTDEF 

QCRLDGLCIPLRWRCDGDTDCMDSSDEKSCE 

GVTHVCDPSVKJFGCKDSARC1SKAWVCDGD 

NDCEDNSDEENCESLACRPPSHPCANNTSVC 

LPPDKLCDGNDDCGDGSDEGELCDQCSLNN 

GGCSHNCSVAPGEGIVCSCPLGMELGPDNHT 

CQ1QSYCAKHLKCSQKCDQNKFSVKCSCYEG 

WVLEPDGESCRSLDPFKmiFSNRHEIRRlDLH 

KGDYSVLVPGLRNT1ALDFHLSQSALYWTDV 

VEDKIYRGKLLDNGALTSFEW1QYGLATPEG 

LAVDW1AGN1YWVESNLDQIEVAKLDGTLRT 

TLLAGDIEHPRAIALDPRDGILFWTDWDASLP 

RIEAASMSGAGRRTVHRETGSGGWPNGLTV 

DYLEKR1LWIDARSDAIYSARYDGSGHMEVL 

RGHEFLSHPFAVTLYGGEVYWTDWRTNTLA 

KAKKWTGHNVTVVQRTNTQPFDLQVYHPSR 

QPMAPNPCEANGGQGPCSHLCLINYNRTVSC 

ACPH3-MKLHKDNTTCYEFKKFLLYARQMEIR 

GVDLDAPYYNYDSFTVPDIDNVTVLDYDARE 

QRVYWSDVRTQADCRAFINGTGVETVVSADL 

PNAHGLAVDWVSRNLFWTSYDTNKKQINVA 

RLDGSFKNAWQGLEQPHGLWHPLRGKLY 

WTDGDNI SMANMDG SNRTLLFSGQKGP VGL 

AIDFPESKJLYW1 SSGNHTTNRCNLDGSGLE VID 

AMRSQLGKATALALMGDKLWWADQVSEKM 

GTCSKADGSGSVVLRNSTTLVMHMKVYDES1 

QLDHKGTNPCSVNNGDCSQLCLPTSETTRSC 

MCTAGYSLRSGQQACEGVGSFLLYSVHEGIR 

GIPLDPNDKSDALVPVSGTSLAVGIDFHAEND 

TIYWVDMGLSTISRAKRDQTWREDVVTNGIG 

RVEGIAVDWIAGNIYWTDQGFDV1EVARLNG 

SFRYWISQGLDKPRAITVHPEKGYLFWTEW 

GQYPRIERSRLDGTERWLVNVSISWPNGISV 

DYQDGKLYWCDARTDKJERIDLETGENREW 

LSSNNMDMFSVSWEDFIYWSDRTHANGSIK 

RGSKDNATDSVPLRTGIGVQLKDIKVFNRDR 

QKGTNVCAVANGGCQQLCLYRGRGQRACA 

CAHGMLAEDGASCREYAGYLLYSERTILKSI 

HLSDERNLNAPVQPFEDPEHMKNVIALAFDY 

RAGTSPGTPNRlFFSDIHFGNlQQINDDGSRRrr 

IVENVGSVEGLAYHRGWDTLYWTSYTTSTJT 

RHTVDQTRPGAFERETVTTMSGDDHPRAFVL 

DECQNXMFWTNWNEQHPSIMRAALSGANVL 

TLrEKDIRTPNGLMDHRAEKXYFSDATLDKIE 

RCEYDGSHRYVILKSEPVHPFGLAVYGEHIF 

WTDWVRRAVQRANKHVGSNMKLLRVDIPQ 

QPMGIIAVANDTNSCELSPCRINNGGCQDLCL 

LTHQGHVNC SCRGGRILQDDLTCRA VN S SCR 

AQDEFECANGECINFSLTCDGVPHCKDKSDE 

KPSYCNSRRCKKTFRQCSNGRCVSNMLWCN 

GADDCGDGSDEIPCNKTACGVGEFRCRDGTC 

1GNSSRCNQFVDCEDASDEMNCSATDCSSYF 
RLGVKGVLFQPCERTSLCVAPSWVCDUANU 

CGDYSDERDCPGVKRPRCPLNYFACPSGRCIP 

MSWTCDKEDDCEHGEDETHCNKFCSEAQFE 

CQNHRC1SKQWLCDGSDDCGDGSDEAAHCE 

GKTCGPSSFSCPGTHVCVPERWLCDGDKDCA 

DGADESIAAGCLYNSTCDDREFMCQNRQCIP 

KHFVCDHDRDCADGSDESPECEYPTCGPSEF 

RCANGRCLSSRQWECDGENDCHDQSDEAPK 

NPHCTSPEHKCNASSQFLCSSGRCVAEALLCN 

GQDDCGDSSDERGCHINECLSRKLSGCSQDC 

EDLKIGFKCRCRPGFRLKDDGRTCADVDECS 

TTFPCSQRCINTHGSYKCLCVEGYAPRGGDP 

HSCKAVTDEEPFLIFANRYYLRKLNLDGSNY 

TLLKQGLNNAVALDFDYREQMIYWTDVTTQ 

GSM1RRMHLNGSNVQVLHRTGLSNPDGLAV 

DWVGGNLYWCDKGRDTIEVSKLNGAYRTVL 

VSSGLREPRALWDVQKGYLYWTDWGDHSL 

1GR1GMDGSSRSVIVDTKITWPNGLTLDYVTE 

RIYWADAREDYIEFASLIX5SKRHVVLSQDIPH 

IFALTLFEDYVYWTDV^CTKSINRAHKTTGTN 

KTLLISTLHRPMDLHVFHALRQPDVPNHPCK 

VNNGGCSNLCLLSPGGGHKCACPTNFYLGSD 

GRTCVSNCTASQFVCKNDKCIPFWWKCDTE 

DDCGDHSDEPPDCPEFKCRPGQFQCSTGICTN 

PAFICDGDNDCQDNSDEANCDIHVCLPSQFK 

CTNTNRCIPGIFRCNGQDNCGDGEDERDCPE 

VTCAPNQFQCSITKJRCITRVWVCDRDNDCVD 

GSDEPANCTQMTCGVDEFRCKDSGRCIPARW 

KCDGEDDCGDGSDEPKEECDERTCEPYQFRC 

KNNRCVPGRWQCDYDNDCGDNSDEESCTPR 

PCSESEFSCANGRCIAGRWKCDGDHDCADGS 

DEKDCTPRCDMDQFQCKSGHCIPLRWRCDA 

DADCMDGSDEEACGTGVRTCPLDEFQCNNT 

LCKPLAWKCDGEDDCGDNSDENPEECARPV 

CPPNRPFRCKNDRVCLWIGRQCDGTDNCGD 

GTDEEDCEPPTAHTTHCKDKKEFLCRNQRCL 

SSSLRCNMFDDCGDGSDEEDCSIDPKLTSCAT 

NASICGDEARCVRTEKAAYCACRSGFHTVPG 

QPGCQDrNECLRFGTCSQLCNNTKGGHLCSC 

ARKFMKTHNTCKAEGSEYQVLYIADDNEIRS 

LFPGHPHSAYEQAFQGDESVRIDAMDVHVKA 

GRVYWTNWHTGTISYRSLPPAAPPTTSNRHR 

RQIDRGVTHLN1SGLKMPRGIAIDWVAGNVY 

WTDSGRDVIEVAQMKGENRKTLISGMIDEPH 

AIWDPLRGTMYWSDWGNHPKIETAAMDGT 

LRITLVQDNIQWPTGLAVDYHNERLYWADA 

KLSVIGSIRLNGTDPIVAADSKRGLSHPFSIDV 

FEDYTYGVTYINNRWKIHKFGHSPLVNLTGG 

LSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPP 

PDAPRPGTCNLQCFNGGSCFLNARRQPKCRC 

QPRYTGDKCELDQCWEHCRNGGTCAASPSG 

MPTCRCPTGFTGPKCTQQVCAGYCANNSTCT 

VNQGNQPQCRCLPGFLGDRCQYRQCSGYCE 

NFGTCQMAADGSRQCRCTAYFEGSRCEVNK 

CSRCLEGACWNKQSGD\TCNCTDGRVAPS 

CLTCVGHCSNGGSCTMNSKMMPECQCPPHM 

TGPRCEEHVFSQQQPGH1ASILIPLLLLLLLVL 

VAGVWVATCRRVQGAKGFQHQRMTNGAM 

NVEIGNPTYXMYEGGEPDDVGGLLDADFAL 
DPDKPTNFTNPVYATL YMGGHGSRHSLAS 1 D 
EKRELLGRGPEDEIGDPLA 


374 


1724 


A 


3187 


191 


1815 


CLELASAGKPEESKALSLLAPAPTMTSLMPG 

AGLLPIPTPNPLTTLGVSLSSLGAIPAAALDPNI 

ATLGEIPQPPLMGNVDPSKIDE1RRTVYVGNL 

NSQTTTADQ1XEFFKQVGEVKFVRMAGDET 

QPTRFAFVEFADQNSVPRALAFNGVMFGDRP 

LKINHSNNAIVKPPEMTPQAAAKELEEVMKR 

VREAQSFISAAIEPGWLHSTSLCNDFLGCF*RR 

RMYRE*APCTICGTFHLCLIINWDL*LF*AYTA 

K*FFPPRVWT<EQ*KKRR\RSRSHTTvSKJSRSSSK 

SHSRRKRSQSKHRSRSHNRSRSRQKDRRRSK 

SPHKKRSKSRERRKSRSRSHSRDKRKDTREKI 

KEKER VKEKDRE Kh Rfc KbK±,Kb js±tf\x,KUisJN 

KDRDKEREKDREKDKJEKDREREREKEHEKD 

RDKJEKEKJEQDKEKEREKDRSKErDEKRKKDK 

KSRTPPRSYNASRRSRSSSRERRRRRSRSSSRS 

PRTSKTIKivKo bKbr or Ko KJN 1vJvi>»isj^jvilivx.£vx-» 

HISERRERERSTSMRKSSNDRDGKJEKXEKNST 

S 


375 


1725 


A 


3192 


415 


101 


AHSSHQTRAILQEFQWDIIRHPPLXSPNLALSG 
FVFPNLKKSLRG T HFSbvlsJs.\x ILl WL,n^Kjur 
WF/FFYP* SPDLQIPSSFlvNGLND WYHHSQKC 
PDLDGAYVKJC 


376 


1726 


A 


3199 


931 


418 


GV*WCDLGSPQPPPPGFKQFCLGRSSSWDYK 
HVPPHPANFVFLLETGFLHAGQAGLXGDPPAS 
ASQSAGITGVSHTWPKNHLIFYACLV1RSKR1 

K 


377 


1727 


A 


3201 


274 


1285 


KTGYTSRGSPLSPQSSIDSELSTSELbDUSISM 

GYKXQDLTDVQIMARLQEESLRQDYASTSAS 

VSRHSSSVSLSSGKKGTCSDQEYDQYSLEDEE 

EFDHLPPPQPRLPRCSPFQRGIPHSQTFSSIREC 

RRSPSSQYrrbNN I Wwww * ior^AVHwvv^ 

NRTNGDK/PPKK Y A* PSPDAKYNCH* * QHXSSP 

VTVRNSQSFDSSLHGAGNGISR1QSCIPSPGQL 

QHRVHSVGHFPVSIRQPLKATAYVSPTVQGSS 

NMPLSNGLQLYSNTGIPTPNKAAASGIMGRS 

ALPRPSL AING SNLPRSKIAQP VRSFLQPPKPL 

S SLSTLRDGN WRDGC Y 


378 


1728 


A 


3202 


112 


1789 


"VPGVTESRPSVLRGDHLFALLSSETHQEDPIT 
YKGFVHKV\ELDRVKLSFSMSLLSRFVGWG* 
PFKVNFY/TFNRQPLJR.V\QHRALELTGRWLLW 
PMLFPWAPRDVPLLPSDVKLKLYDRSLESNP 
EQLQAMRHIVTGTTRPAPYICFGPPGTGKTVT 
LVEAIKQVVKHLPKAHILACAPSNSGADLLC 
QRLRVHLPS SI YRLLAPSRDIRMVPEDIKPCCN 
WDAKKGEYVFPAJ<J<ja.QEYRVLnTLITAGR 
LVSAQFPIDHFTHIFIDEAGHCMEPESLVAIAG 
LMEVKETGDPGGQLVLAGDPRQLGPVLRSPL 
TQKHGLGYSLLERLLTYNSLYKKGPDGYDPQ 
irrrvT I PMVT3 nWPTTI DTPNOI YYRGELOACA 
DVVDRERFCRWAGVLPRQGFPUFHGVMGKX) 
EREGNSPSFFNPEEAATVTSYLKLLLAPSSKK 
GKARLSPRSVGVISPYRKQVEKIRYaTKLDR 
ELRGLDDIKDLKVTCCSTVTPCLPCAPTCPLP 
ETSSSFHSSPRPRPTPAALNRARALPEPLTPGD 
SNLRVWDGIRKPACLTNTSCHS 


379 


1729 


A 


3206 


432 


130 


PKAAPS VXL WFPPFL* GSFKPTKGHTXCVX1K 
♦LSTR£AXDSXPGRQIAXXRQGGK\^TTTAL 
XKQSNNKGTRASSYXEPDAXEQWKFPHKKL 
QLPGXTHE 


380 


1730 


A 


3207 


187 


507 


GGTGHPHPARPPLSGVGGCQCSHSKPWTAGS 
PEQRDHPAPHKQIEAGQGLPGPQAWGG*K.GP 
AXLLPGPGGGPGPVASLEARAQASSGVTPNG 
GGRTYP YPTF S SGE 


381 


1731 


A 


3225 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF* GESLV 

EMQLITSLGLQEFDIARNVLELIYAQTLVWIGI 

FFCPLLPFIQMIMLFIMFYSKNISLMMNFQPPS 

KAWRASQMMTFFIFLLFFPSFTGVLCTLAITI 

V^ r RLKPSAE>CGPFRGLPLFlHSIYSV/IDTLSTRP 

GYLWVVWIYRNL1GSVHFFFILTL1VLIITYLY 

WQITEGRKIMIRLLHEQIINEGKDKMFLIEKXI 

KLQDMEKKANPSSLVLERKEVEQQGFLHLGE 

HDGSLDLRSRRSVQEGNPRA 


382 


1732 


A 


3238 


256 


38 


" LLMlKVSSTCFSCHLHHHHHHHHRHHQUHNb 
LFFSLKSSSNSSTLPVYLSYNIILVFSKCLVFDF 

LFSNACL 


383 


1733 


A 


3241 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKAU 

KVTMT.WNKKATAVLVIASTDVX)KTGASYYG 

EQTLHYIATNGESAWQLPKNGPIYDVVWNS 

SSTEFCAVYGFMPAKATIFNLKCDPVFDFGTG 

PRNAAYYSPHGHILVLAGFGNLILQI*AD/IMK 

VWNVKNYKLISKPVASDSTYFAWCPDGEHIL 

TATCAPRLRVNNGYKIWHYTGSILHKYDVPS 

NAELWQVSWQPFLDGIFPAKTITYQAVPSEVP 

NEEPKVATAYRPPALRNKPITNSKLHEEEPPQ 

NMKPQSGNDKPLSKTALKNQRKHEAKKAAK 

QEARSDKSPDLAPTPAPQSTPRNTVSQSISGDP 

EIDKiOKNLKKKJLKAIEQLKEQAATGKQLEK 

NOLEKIQKETALLQELEDLELGI 


384


1734


A 


3242 


3 


678 


IRSPAARSPGLETPTCLLFVIAAIAAVFVDSA11" 

RLTQHRPQDGSFPYTILDPPLYLPGQCAPPQP 

LSQCARRVHGEKLRRPTFGPRHRGAGTAKMS 

ASLVRATVRAVSKRKLQPTRAALTLTPSAVN 

KIKQLLKDKPEHVGVKVGVRTRGCNGLSYTL 

EYTKTKGDSDEEVIQDGVRVFIEKKAQLTLL 

GTEMDYVEDKLSSEFVFNNPNIKGTCGCGES 

FMT 


385 


1735 


A 


3243 


3190 


664 


" VAMGTPRAQHPPPPQLLFLlLLSCPWigULl'L 
KEEEILPEPG SETPTV ASE AL AELLHG ALLRR 
GPEMGYLPGPPLGPEGGEEETTTTIITTTTVTT 
TVTSPVLCNNNISEGEGYVESPDLGSPVSRTL 
GLLDCTYSIHVYPGYGIEIQVQTLNLSQEEELL 
VLAGGGSPGLAPRLLANSSMLGEGQVLRSPT 
NRLLLHFQSPRVPRGGGFRIHYQAYLLSCGFP 
PRPAHGDVSVTDLHPGGTATFHCDSGYQLQG 
EETLICLNGTRPSWNGETPSCMASCGGT1HNA 
TLGRJVSPEPGGAVGPNLTCRWVIEAAEGRRL 
HLHFERVSLDEDNDRLMVRSGGSPLSPVTYDS 
DMDDVPERGLISDAQSLYVELLSETPANPLLL 
SLRFEAFEEDRCFAPFLAHGNVTTTDPEYRPG 
ALATFSCLPGYALEPPGPPNAIECVDPTEPHW 
NDTEPACXAMCGGELSEPAGWLSPDWPQS 
YSPGQDCVWGVHVQEEKRILLQVEILNVREG 
DMLTLFDGDGPSARVLAQLRGPQPRRRLLSS 
GPDLTLQFQAPPGPPNPGLGQGFVLHFKEVPR 
NDTCPELPPPEWGWRTASHGDLIRGTVLTYQ 
CEPGYELLGSDDLTCQWDLSWSAAPPACQKI 
MTCADPGEIANGHRTASDAGhPVubH v(^YKC
LPGYSLEGAAMLTCYSRDTGTPKWSDRVPKC 
ALKYEPCLNPGVPENGYQTLYKHHYQAGESL 
RFFCYEGFEL IGE VTITC VPGHPSQ WTSQPPLC 
KVTQTTDPSRQLEGGNLALAILLPLGLV1VLG 
SGVYIYYTKLQGKSLFGFSGSHSYSPITVESDF 
SNPLYEAGDTREYEVSI 


386 


1736 


A 


3250 


5725 


3984 


GTSTVTMATKXHFSITLNLLGMLLKKDNQDT 

RKLLMTWALEVAWMKKSETYAPLFCLPSF 

HKJCKGLLADTLVEDVNICLQACSSLHALSSS 

LPDDLLQRCVDVCRVQLVHRGTCIRQAFGKJL 

LKSIPLGVFLSNNNHTEIQE1SLALRSHMSKAP 

SNTFHPQDFSD/VISFILYGNSHRTGKDNWLE 

RLFYSCQRLDKRDQSnPRNLLKTDAVLWQW 

AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIIR 

SLAGHTLNPDQDVSQWTTADNDEGHGNNQL 

RLVLLLQYLENLEKLMYNAYEGCANALTSPP 

KVIRTFLYTNRQTCQDWLTRIRLSIMRVGLLA 

GQPAVTVRHGFDLLTEMKTTSLSQGNELEVSI 

MMVVEALCELHCPEAJQGIAVWSSSIVGKHL 

LWINSVAQQAEGRFEKASVEYQEHLCAMTG 

VDCCISSFDKSVLTLASAGCK5ASLKHCLNGE 

SRKSVLSKPTDSSPEVrNYLGNKACECYISTA 

DWAAVQEWQNAIHDLKKSTSSTSLNLKADF 

NYIKSLSSFESGKFVECTEQLELLPGENINLLA 

GGSKEKJDMKKLLRNM 


387 


1737 


A 


3255 


380 


76 


MDIFLYNCKYQVQTEI*NSIQHIMA\SKKLSRF 
LKYVHNL* AENYKTLMK*INEDLNKQRDVPY 
S * TARLNKMSIPTKTIFRFKAIYIKIP ATYFI ET 
NMQ 


388 


1738 


A 


3260 


685 


428 


PQWLGLQVYALPPANFVFFVtMKb 1 LLAy 1 U 
FELLDSSDLPASASKSAGITCMSHHARTLSLK 
* WPFCLS ATQEKFC * P ASEG V A W 


389 


1739 


A 


3269 


1 


332 


LDG YHTPIYMLNRIIRLPAAL* IISDQTGHALTI 
LTRLETQMINADYQNKLTLD YLLTTDREVY E 
PFNLTNYCLHIHNQRLGAYDLG* V* Q/KLAHV 
PVQV*HGFDPEAMFR 


390 


1740 


A 


3270 


2 


372 


GRCHDQNKGKS\DGPDAQAEACGGESTYQEL 
LWQ^TIGQPLACRRLTRKIYEGIKKAVKPNH 
SPRGVXKVHKPVNKGEKGIMVLAGU I Lulu V 
YCLLPCMC*DRKLTYAHIPSTTDLGAGAGY 


391 


1741 


A 


3273 


1 


187 


FFQEMLDIMKA1SDMMGKCTYPVLKEDAPRQ 
HVETFFQXEELTRSQEGMKLGENFLMFAMPP 
DDSKESKGK'FFQEMLDlMKJUbUMMUKL, l i 
PVLKEDAPRQHVETFFQVGINQKSRGHEVRR 
KFPDVCHAPR 


392 


1742 


A 


3281 


901 


521 


FFFGDGVSPCRQAGV* WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVLPRQANFCIF/M*RRG 
FTMLARMVbls'rKIJLJrAl-AbvbAUl I u v onn 
APPQMDFTFALLCFALKGCLPRQKEGGTLNLI 


393 


1743 


A 


3283 


385 


3 


RNRSVVPEFV1XGLSAGPQTQTLLFVLFVVIC 
LLTVMGNIXLLVVINADSCLrrrPMYFFLGQL 
SFLDLCHSSVTAPKLLENLLSEKKTISVEGCM 
A* VFFVFATGGTESSLLAVMAYDRYVAIRTR 

G 


394 


1744 


A 


3284 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKC 
LDNCPEGLEANNrTTNIECVSIVHCEVSEWNP 
WSPCTKKGKTCGFKRGTETRVREIIQHPSAKG 
NLCPPTNETRKCTV QRKKCQKGERGKKGRE 
RKRKKPNTCGESKEAlPDSKSLESSKJBIPHQKhN
KQQQ 


395 


1745 


A 


3286 


1 


340 


RVLYVPSMGFCILVAHGWQKISTKSVEKKLS 
WICLSNmLTHSLKTFHRNWDWESEYTLFMS 
ALKVNKNNAKLWNNVGHALENEKNFERAL 
KYFLQATHVQPDD1GAHMNVGR 


396 


1746 


A 


3293 


1 


172 


GFRA WMTVK 1 1 AAKu lL.li oKivmuM v /viiv 
IAFMKQRRMGLNDFIQKIANNSYACKQ 


397 


1747 


A 


3295 


12 


401 


AEPACGASSCTPPSLRSSSSQSVGPLRPGRPL 
WSEACAFL*AAAPQGPASPCCGLPSGFPRVW 
AQCCPPGGALRFPEGLGSVLSPRRCPQVSRGS 
GLSAVPQEVPSGFLGPGLRACPQEAPSRFLRA 

GLT 


398 


1748 


A 


3300 


1912 


2768 


KQRRWQNlQRKGPKRYIVlAGNSQSHQPMIh S 

MLRKJLPKVTCRDVLPEIRAICIEEIGCWMQSY 

STSFLTDSYLKYIGWTLHDKHREVRVKCVKA 

LKGLYGNRDLTARLELFTGRFKI)WMVSMIV 

DREYSVAVEAVRLLILILKNMEGVLMDVDCE 

SVYPIV*ASN*GLASAVGEFLYWKLFYPECEI 

RTMGGREQRQSPGAQRTFFQLLLSFFVESKSH 

SVTQAGVQWQFSAHRDLCLPGSSNSHVSASR 

VAGIAGAHRHTWLIYVFFSWRQGFAVLAGL 

VSNS 


399 


1749 


A 


3301 


536 


2391 


LRSYGCKAPSRISHLHKXFLFLLLPSLLMGYbJt 

SPPPITDS W APFI SLTHHVLSQSQSPLSSNC W3 

CLSTHTQ* FT ALP ADLLTWTQSNVSLHIS YLAI 

PFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 

GRAVALLHLIASGLTSIQTNTASSKPPIWGY\L 

STQTSFISPPPLCLSRTYPNPAHATMVGQVPQ 

SLCGLIFTL/RTPCRPSILHPNYKIISTSAWQKV 

LCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAAN 

S ALY VSSLKGPPGKNVTIPSPVTGT* QPPHRGS 

N/RLTVDKDNFFLSPKPNSLHQLPSQ\TPYQAL 

TGAALAGSYPIWENENTLSWLPTFTYNFCLST 

PSLFFLCDTN*YLCLPANWSGTCTLVFQAFTI 

NILPPNQTILISVEASISSSPIRNKWALHLITLLT 

GLGITAALGTGLAGU l SITbYQILh 1 ILbiNi vc 

DMHTSITSLQRQLDFLVGVILQNWRVLDLLT

TEKGGTCIYLQEECCFCVNESGIVHIAVRRLH 

DRAAEL* HQ V ADS W WQGS SLLR WIP WV APF 

LGPLIFLFLLLMIGPCIFNLVSRFISQRLNCFIQ 

ASMQKHIDNIrrlLCriV* Y i^oLKOlNnoi^rvriii i\. 

P 


400 


1750 


A 


3303 


2 


453


THWRHSSGVPGSTTARRRRRELEIATSUNQb 

YYKRLCQEVTNRERNIX)KMLADLDDLNRTK 

KYLEERLIELLRDKDALWQKSDALEFQQKLS 

A^ERWLGDTEANHCLDCKREFSWMVRRHHC 

RJCGRFCYYCCNNYVLSKHGGKKERCC 


401 


1751 


A 


3304 


1 


626 



MAPQHSSLDDKVP^L^ASs I v^mr v^iL^nov 

CTEHKDSLWGPGARSQPFGAHNTRLSPDSCP 

EKIVLRALKDSRAGMPEQDKDPGVQENPDD 

QRRVPQGTGDAPSAFRPLWDNGGLSPFVSRP 

GPLERDLHAQRSEVTYNQRSQSSWMSSFPKR 

NAFVSPYSSMGQAQP/GLPKTNPIGESCCWEG 

LSLSTQILG* QKPSK YJPSLCKR 


402 


1752 


A 


3305 


1678 


172 


MELPSGPGPERLFDSHRLPGDCFLLLVLLLYA 
PVGFCLLVLRLFLGIHVFLVSCALPDSVLRRF 
VVRTMCAVLGLVARQEDSGLRDHSVRVLI5N 
HVTPFDHNmiLLTTCSTVSESEAESATGRFP 
GAQLKAPLSPLAFRMEDTEALPLTPIL YFl 

FFFFuTLNIFLLAFSSPGSQPLLNSPPSFVCWSR 

GFMEMNGRGELVESLKRFCASTRJLPPTPLLLF 

PEEEATNGREGLLRFSSWPFSIQDWQPLTLQ 

VQRTLVSVTVSDASWVSELUWSLFVPrTVY 

„, rr> ii rr DD\/LT0m nKAlSTFPFAl R VOOYLV AJCE 

QVKWLKPVHKyj^OE~Ai^i-^^'^ IS ' v ^ r ^ L - 

LG\QTGTRLTPA\DKAEHMKRQRHPR\LRPQS 

AQSSFPPSPWVLSS/SDVQTGQTLGFREFKESF 

CPHV AIG VFIPERP WPKTG CCKTLTIHLILL* G 

GPVSFSCPE\DIHPRGT*VPTQQASGLPSFPSYG 

PARGGVL*HPSAQQPL I r A\Koc>\ vv AK/\urv^x, 

OERKOVALYEYARRRFTERRAPGGLD 


403 


1753 


A 


3307 


44 


447 


DPSPSLLAVALGLRAGERIRSGPGSSSPSCjUIS) 
GGASAGLASSPECACGRSHFTCAVSALGECT 
CIPAQWQCDGDNDCGDHSDEDGCILPTCSPL 

_ _ _ . „ _ _ ^*,m noun mFkCriXTHPCT^nCnP 

DFHCDNGKCIRRSWVCDSDNDCfciJU2»ur,v A - ; 
rPPRF.rF.KD 


404 


1754 


A 


3311 


409 


1 


PRHG WGRRVLGRDRPRLQKVKKS VKA1 Y 1FO 
QDHVQNEEIYARVLDKFGSNFLSRDNADLGT 
AFVKFSTLTK*LSALLKNLLQGLSRNVIFTLDS 
LLKGDLKGVKGDLKKPFDKAWKDYEllU'AK 

1EKEKRERE W R 


405 


1755 


A 


3322 


12 


458 


AAVPVENPWDDPRVRPRVRIFTWEDClAUgA 
KVLCNDSYGVT1DWSPKGAFIRLTSQSVGNG 
HPASKENDQMVDTIKNTTKVPIIWTYGDMVE 
PRPQMIRP A VGAKHKEL WKILMALKKIKXI w c 
GKYTKPSQYNPNYMLELAHNDSVW


406


1756 


A 


3324 


1 


426 


LSMLSTISTEHRLSVLWPlWYCCHCPTHLbAV 
MCVLLWALSLLQSILEWMFCSFLFSDVDSDN 
WCQILDFLTAVWLIFLIXLVLCGFTLVLLVRIIC 
GSQKMPLTRLYVTILLTGLVFLFCSLPLSIQ*F 
LLYWIEKDLDDL 


407 


1757 


A 


3328 


213 


1841 



SGDLSPAFXMMLTIGDVIKQLIEAHEQUKJJI^ 

LNKVKTKTAAKYGLSAQPRLVDI1AAVPPQY 

RKVLMPKLKAKPIRTASGIAWAVMCKPHRC 

PHISFTGN1CVYCPGGPDSDFEYSTQSYTGYEP 

TSMRA1RARYDPFLQTRHRIEQLKQLGHSVD 

KVEFIVMGGTTMALPEEYPJ)YFIRNLHDALS 

GHTSNNIYEAVKYSERSLTKCIG1TIETRPDYC 

MKRHLSDMLTYGCTRLEIGVQSVYEDVARD 

Th^GHTVKAVCESFHLAKDSGFKWAHMMP 

DLPNVGLERDIEQFTEFFENPAFRPDGLKLYP 

TLV1RGTGLYELWKSGRYKSYSPSDLVELVA 

RIL AL VPP WTRVYRVQKIJlriMx' v sou v cnu 

NLRELALARMKBLGIQCRDVRTREVGIQEIH 

HKVRPYQVELVRRDWaNGGWIHTLSYEDP 

DQDILIGLLRLRKCSEETFRFELGGGVSIVREL 

HVYG S WP VS S RDPTKFQHQGFGMLLMEE A 

t--p>t a r»Trt?ij/^criVT a VT^rTVfTTRMYYRICIGYRL 
ERIAREEHObtjKiAV lesu vui jvjn i i ivi\j.*j a ^ 

OGPYMVKMLK 


408 


1758 


A 



3335 


3 


467 


AUSPRAAGIRHELTSTMAAGKNKRLTKCiUK 
KGAKKKAV/DNIINIGKTLVTRTQRTKIASDG 
LKGR VFEESL ADLQNTXTDG YLLRV1* VAFTT 
ERTOQI/REVF?^IPDSIGKDIEKACQSrYPLH 
DDFARKVKMLKKPKFELRKLMELHGEGSS 


409 


1759 


A 


3338 


7 


1252 


" PRWRNSARDEILLSFPQNYYIQWLNGSLLHUL 
WNLASLFSNLCLFVLMPFAFFFLESEGFAGLK 
KGIRARILETLGMLLLLALL1LGIVWVASALID 
NDAASN1ESLYDLWEFYLPYLYSCISLMGCLL 
LlXCTPVGL\SRMFTVMGQLLVKPTILtULDh 
QIY1ITLEEEALQRPTKWAVFIRW/KYNIMELE 
QELENVKTIuKTKLERRKKASAWERNLVYPA 
VNWLLLIETS1SVLLVACN1LCLLVDETAMPK 
GTRGPGIGNASLSTFGFVGAALEIILIFYLMVS 
SWGFVSLRFFGNFTPKKDDTTMTKIIGNCVS 
ILVLSSALPVMSRTLGITRFDLLGDFGRFNWL 
GNFYIVLSYNLIJAIVTTLCLVRKFTSAVREE 
LFKALGLHKLHLPNTSRDSETAKPSVNGHQK 

AT 


410 


1760 


A 


3339 


127 


1433 


G SHRFSL ASPLDPE V GPY CDTPTMRTLFN LL 

WLALACSPVHTTI^KSDAKKAASKTLLEKSQ 

FSDKPVQDRGLVVTDLKAESWLEHRSYCSA 

KARDRHFAGDVLGYVTPWNSHGYDVTKVFG 

SKFTQ1SPVWLQLKRRGREMFEVTGLHDVDQ 

GWMRAVRKHAKGL\P*CLGSCLRTGLTMISG/ 

YVLDSEDEIEELSKTWQVAKNQHFDGFVVE 

V WNQLL SQKRVGLIHMLTHLAE ALHQ ARLL 

ALLV JJPPATTPGTDQLGMFTHKEFEQLAP VLD 

GreLMTYDYSTAHQPGPNAPLSWVRACVQV 

LDPKSKWRSK1LLGLNFYGMDYATSKDAREP 

WGARYIQTLKDHRPRMVWDSQVSEHFFEY 

KKSRSGRHWFYPTLKSLQVRLELARELGVG 

VSI WELGQGLD YFYDLL* VGIAASAVDVFFSK 

PWSE 


411 


1761 


A 


3342 


74 


2701 


VATRKLAKGFTQFAK^TEGTKKTSIO^KFI-K 

FKGFGSFSNLPRSFTLRRSSASISRQSHLEPDTF 

EATQDDMVTVPKSPPAYARSSDMYSHMGTM 

PRPSIKKAQNSQAARQAQEAGPKPNLVPGGV 

PDPPGLEAAKEVMVKATGPLEDTPAMEPNPS 

AVEVDPIRKPEVPTGDVEEERPPRDVHSERAA 

GEPEAGSDYVKJFSKEKY1LDSSPEKLHKELEE 

ELKLSSTDLRSHAWYHGRIPREVSETLVQRN 

GDFLIRDSLTSLGDYVLTCRWRNQALHFKIN 

KWVKAGESYTHIQYLFEQESFDHVPALVRY 

HVGSRKAVSEQSGAIIYCPVNRTFPLRYLEAS 

YGLGQGSSKPASPVSPSGPKGSHMKRRSVTM 

TDGLTADKVTRSDGCPTSTSLPRPRDSIRSCA 

LSMDQIPDLHSPMSPISESPSSPAYSTVTRVHA 

APAAPS ATALPASP VARRS SEPQLCPGS APKT 

HGESDKGPHTSPSHTLGKASPSPSLSSYSDPDS 

GHYCQLQPPVRGSREWAATETSSQQARSYGE 

RLKELSENGAPEGDWGKTFTVPIVEVTSSFNP 

ATFQSLL1PRDNRPLEVGLLRKVKELLAEVDA 

RTLAJIHVTKVDCLVARILGVTKEMQTLMGV 

RWGMELLTLPHGVRKLRLDLLERFHTMSIML 

AVDILGCTGSAEERAALLHKTIQLAAELRGT 

MGNMFSFAAV^GALDMAQISRLEQTWVTLR 

QRHTEGAJXYEKKLKPFLKSLNEGKEGPPLSN 

TTFPHVLPLITLLECDSAPPEGPEPWGSTEHGV 

I EWLAHLEAARTVAHHGGLYHTNAEVKLQG 
FQARPELLEVFSTEFQMRLLWGSQGASSSQA 

RR YEKFDK VLTAL SHKLEPA VRS SEL


412 


1762 


A 



3347 


1 


898 


t np* "a a f C R TK PLPMAVSIRGN ADS IV ACL VLM 
\aYLIKKRLVACAAVFYGFAVHMKIYPETYI 
LPITLHLLPDRDNDKSLRQFRYTFQACL*ELL
KJILCKRTALMPVAVAGLTFFALSFGFYYEYG
WEFLEHTYFYHLTRRDIRHNFSPYFYMLYLT
AESKWSFSLG1AAFLPQLILLSAVSFAYYRDL
\TCWFLHTSI^VTFNKVCTSOYFLWYLCLLPL
VN^LVRMPWKRAVVLLMLWFIGQAMWLAP
AYVLEFQGKNTFLFIWLAGLFFLLINCSILIQII 
SHYKEEPLTERDCYD 


413 


1763 


A 


3361 


3 


474 


PIP VR WN SLEGRLLRG YEQHANDGKD Y 1 S KN 

♦DLRSWTAADMAAQnXRKWEAEEFAEQlKA 

YLEGTCVER/LRTHLENGKETLQLTEQSSQPTI 

PIVGIVAGLVLLGAWTGAWSAVMCRKKNS 

GHFLPTDRVSYSEAASSDHAQGSDVSLTACK 

V 


414 


1764 


A 


3363


1488 


453 


HQ1LELKKKILKTYNPDYDEDL VQEAS SED VL 

GVHMVDKDTERDIEV1KRQLRRLRELHLYST 

WKKYQEAMKTSLGVPQRERDEGSLGKPLCP 

PEILSETLPGSVKKRVCFPSEDHLEEFIAEHLP 

EASNQSLLTVAHADAGTQTNGDLEDLEEHGP 

GQTVSEEATEVHMMEGDPDTLAELLIRDVLQ 

ELSSYNGEEEVDPEEVKTSLGVPQRGDLEDLE 

EHVPGQTVSEEATGVHMMQVDPATLAKSDL 

EDLEEHVPEQTVSEEATGVHMMQVDPATLA 

KQLEDSTITGSHQQMSASPSSAPAEEATEKTK 

VEEEVKTRKPKKKTRKPSKKSRWNVLKCWD 

IFNIF 


415 


1765 


A 


3369 


431 


315 


""lPWSWVGRLSVRKMSILF*LTYNYNAILNKTP 
PSFSPSL 


416 


1766 


A 


3373 


42 


651 


RQEKMGLGEIGASGVLRSMLKERKKQNMKG 
NGNVTLTPLLPAVQCGCHLQPAGRSPLPSSHS 
APGLCSPLHPLQPQQEASTCPSGTLQGREKAA 
PGQGRPLCSLWAGGAGAVPGERGAEGRGPSD 
Q APDPKSGPWLFPPGLGAPAE VRLHNVPHNL 
RRPPLP*ARGK*PPNSGCPWSEGRAKQPLSCG 
PKPQCSLPSQVPGDTH 


417 


1767 


A 


3382 


2 


2061 


EAQDPRACGPDAGGRFAARDAPGNSLRPPPS 

SPP/GWPGQLRLLPRVPGSELRCGKPERGRLP 

ASPPGKIRGWPPGISKRPGLGGRSFPPGFAPRT 

WRPEARGPSVQSLPPIFSPQSAQTTAR*RPGAP 

KNAGRCGGA\RGPRLSLGPPPGPPPAPALPAR 

ASAGAGAAAAALAVGGVRGAGGARGTGGY 

GHCSGR/PTGRTGPGPQGPGPPMPARPR*AS\S 

TRGSRRGPGSRPARAAAAPRAGDHGRRPVRV 

HLRQHTA V* EPRLGDATAPPGG AAGPG AP AP 

R\GPGWDCALLPSPGPRSPRAVGCAEPEIWDP 

S PRRGTSP VPS VRSLRS EP ANPRLGLP ALLN S Y 

PLKGPGLPPPWGPRTQTGHVHTVQPSGSCIEH 

SKSLD/RGPWGAPPWGPSSSGLCSPKLATAGP 

PQ S WGLCQIGRRRGLG GPGLKRGET/GLL* G C 

SMDHANRTKGPGVPTSNRCFSHIPG\GDGCSD 

HSSCEGHPDLHAGREMPAAPGLSELERVRFT 

VGCGGLASGISSASVSGLSPNRAGGPGQGDW 

EMYPVSWQTQESGGQG/SPKTGR*VGMLQA 

GAGSLQGGTGDGVWGLWEDGP/RG*DSPLPS 

GTGTEP*TPTTSIPFFPQPSGVYPSRATLLPMPS 

Y*ALGPSANKSEKPLLSFLYRGLCCRJSLQLA 

KGIGQLSEIPLLNVETAFWSMWVTYFRK 


418 


1768 


A 


3398 


304 


") 1 O 1 


EEEEEEEDEDDDDNNEEEEFECYPPGMKVQV 
RYGRGKNQKMYEASIKDSDVEGGEVLYLVH 
YCGWNVRYDEWIKADKIVRPADKN\TK1KH 
RKKIKNKLDKEKDKDEKYSPKNCKPPALGPN 
PPFQTNP1SWKWYPKLDLTDAKNSDTAHIKSI 
EITSILNGLQASESSAEDSEQEDERGAQDMDN 
NGKEES KIDHLTKNRNDLI SKEEQNS S SLLEE 
NKVHADLVISKPVSKSPERLRKDIEVLSEDTD
YEEDEVTKKRKD VKKDTTDKS SKPQ1KRGKR 
RYCNTEECLKTGSPGKXEEKAKNKESLCMEN 
SSNSSSDEDEEETKAKMTPTKKYNGLEEKRK 
SLRTTGFYSGFSEVAEKRIKLLNNSDERLQNS 
RAKDRKDVWSSIQGQWPKKTLICELFSDSDTE 
AAASPPHPAPEECr V AbtoJLCJ 1 v/\iir»iio^orov 
ELEKPPPVNVX)SKPIEEKTVEVNDRKAJEFPSS 
GSNFSA* IPLPYLHLNRLHQSL* QKGSRQQSS 
VTV SEPLAPNQEbVKbilvor. 1 us 1 in v v auc 
LQDLQSERE* LASRF*CQCELKQ** SARTRTS* 
KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK 
QQKEGK 


419 


1769 


A 


3399 


206 


463 


QRECLSIHIGQAGIQIGD ACWbL Y Cbkriuivjr 
NGVVLDTQQDQLENAKMEHTNASFDTFFCE 
TRA GKHVPRAJLF VDLEPTVIDGIR 


420 


1770


A 


3408 


1010 


685 


RRLSFFF* IWSSVLVTQARVQWRDLGSPQPLP 
PGFKRFSCLSLPSSWDYRHPSPRPVNF/HVFLV 
VMGFHHVGQAGLELLTSGDLPALASQSAR1T 
GVNHCAQPRGHFH 


421 


1771 


A 


3409 


355 


1326 


ADSNLIESCWQELGLGPWGGDWRVEQVCiAS 

ASLRFPREVCSIRFLFTAVSLLSLFLSAFWLGL 

LYLVSPLENEPKEMLTLSEYHERVRSQGQQL 

QQLQAELDKLHKEVSTVRAANSERVAKLVF 

QRLNEDFVRKPDY ALSS VGASIDLQKTbrlD Y 

ADRNTAYFWNRFSF\^WARPPTVILEPHYFP 

GNCWAFEGDQGQWIQLPGRVQLSDITLQHP 

PPSVEHTGGANSAPRDFAVFFLLSFFTHQGLQ 

VYDETEVSLGKFTFDVEKSEIQTFHLQNDPPA 

AFPKVKJQILSNWGHPRFTCL YK VKAHu VKi 

SEGAEGSAQGPH 


422 


1772 


A 


3412 


2 


421 


EFDAQPSIGALWFKRP*ATTGSDPGPKRGMN 

YLVSCSMRSPESGKGEPGTARDYTPMGRPPP 

PVPSVSPGPLPGSLAIAPHSPEPHPWEQQPPRG 

QARSPPGGWLGSAT/RVRRPHNHP/RGH/HSP 

VDTAGAPASPGPDVCE 


423 


1773 


A 


3420 


91 


706 


DAQRAlYSSVuPAVbLK^K^yjJUAvivDauiu 
RGGVRSFSRAAAAMAPIKVGDAIPAVEVFEG 
EPGNKVNLAELFKGKKGVLFGVPGAFTPGCS 
KTHLPGFVEQAEALKAKGVQWACLSVNDA 
FVTGEWGRAHKAEGKVRLLADPTGAFGKET 
DLLLDDSLVSIFGNRRLKJRFSMVVQDGIVKA 
LNVEPDGTGLTCSLAPNIISQL 


424 


1774 


A 


3421 


4 


7688 


" "RQVTRVGTRVLGSTl'AAVFLSVEDDNDNAPQ 
FSEKRYWQVREDVTPGAPVLRVTASDRDKG 
SNAVVHYSIMSGNARGQFYLDAQTGALDVV 
SPLDYETTKEYTLRVRAQDGGRPPLSNVSGL 
VTVQVLDINDNAPIFVSTPFQATVLESVPLGY 
LVLHVQAIDADAGDNARLEYRLAGVGHDFP 
FTINNGTGWISVAAELDREEVDFYSFGVEAR 
DHGTP ALT ASA S VS VTALDVNDNNPTFTQPE 
YTVRLNEDAAVGTSWTVSAVDRDAHSVITY 
QITSGNTRNRFSITSQSGGGLVSLALPLDYKLE 
RQYVl^VTASDGTOQDTAQIVVNVTDANTH 
RPVFQSSHYTVNVNEDRPAGTTVVLISATDE 
DTGENARTTYTMEDSIPQFRIDADTGAVTTQA 
ELDYEDQVSYTLAITARDNGIPQKSDTTYLEI 
LVNT^VhnDNAPQF^RDSYQGSVYEDVPPFTSV 
LQISATDRDSGLNGRVFYTFQGGDDGDGDFI 
"VESTSGIVRTLRRLDRENVAQYVLRAYAVDK 

GMPPARTPMEVTVTVLDVNDNPPVFEQDEFD 

VFVEENSPIGLAVARVTATDPDEGTNAQIMY 

QIVEGNPEVFQLDIFSGELTALVDLDYEDRPE 

YVLVIQATSAPLVSRATVHVRLLDIWDNPPV 

LGNFEIXFNNYVTNRSSSFPGGAIGRVPAHDP 

DISDSLTYSFERGNELSLVLLNASTGELKLSR 

ALDNNRPLEAIM S VL V SDG VH S VT AQC ALR V 

TIITDEMLTHSITLRLEDMSPERPLSPLLGLFIQ 

AVAATLATPPDHVVVFNVQRDTDAPGGHILN 

V SL SVG QPPGPG GGPPFLPS EDLQERL YLNRS 

LLTAISAQRVLPFDDNICLREPCENYMRCVSV 

LRFDSSAPFIASSSVLFRPIHPVGGLRCRCPPGF 

TGDYCETEVDLCYSRPCGPHGRCRSREGGYT 

CLCRDGYTGEHCEVSARSGRCTPGVCKNGGT 

CVNLLVGGFKCDCPSGDFEKPYCQVTTRSFP 

AHSFITFRGLRQRFHFTLALSFATKERDGLLL 

YNGRFNEKHDFVALEVIQEQVQLTFSAGEST 

TTVSPFVPGGVSDGQWHTVQLKYYNKPLLG 

QTGLPQGPSEQKVAVVTVDGCDTGVALRFGS 

VLGNYSCAANQGTQGGSKKSLDLTGPLLLGG 

VPDLPESFPVRMRQFVGCMRNLQVDSRHIDM 

ADFI ANNGTVPGCPAKXN VCD SK.TCHNGGTC 

VNQWDAFSCECPLGFGGKSCAQEMANPQHF 

LGSSLVAWHGLSLPISQPWYLSLMFRTRQAD 

GVLLQATTRGRSTITLQLREGHVMLSVEGTGL 

QASSLRLEPGRANDGDWHHAQLALGAIGGP 

GHAILSFDYGQQRAEGNLGPRLHGLHLSNITV 

GGIPGPAGGVARGFRGCLQGVRVSDTPEGVN 

SLDPSHGESINVEQGCSLPDPCDSNPCPANSY 

CSNDWDSYSCSCDPGYYGDNCTNVCDLNPC 

EHQSVCTRKPSAPHGYTCECPPNYLGPYCET 

RIDQPCPRGWWGHPTCGPCNCDVSKGFDPDC 

NKTSGECHCKENHYRPPGSPTCLLCDCYPTG 

SLSRVCDPEDGQCPCKPGVIGRQCDRCDNPF 

AEVTTNGCEVNYDSCPRAIEAGIWWPRTRFG 

LPAAAPCPKGSFGTAVRHCDEHRGWLPPNLF 

NCTSITFSELKGFAERLQRNESGLDSGRSQQL 

ALLLRNATQHTAGYFGSDVKVAYQLATRLL 

AHESTQRGFGLSATQDVHFTENLLRVGSALL 

DTANKRHWELIQQTEGGTAWLLQHYEAYAS 

ALAQNMRHTYLSPFTIVTPNIVISWRLDKGN 

FAGAKLPRYEALRGEQPPDLETTVILPESVFR 

ETPPWRPAGPGEAQEPEELARRQRRHPELSQ 

GEAVASV1IYRTLAGLLPHNYDPDKRSLRVPK 

RPIINTPWSISVHDDEELLPRALDKPVTVQFR 

LLETEERTKPICVFWNHSILVSGTGGWSARGC 

EWFRNESHVSCQCNHNfTSFANTMDVSRRE 

NGEILPLKTLTYVALGVTLAALLLTFFFLTLL 

RILRSNQHGIRRNLTAALGLAQLVFLLGINQA 

DLPFACTVIA1LLHFLYLCTFSWALLEALHLY 

RALTEVRDVNTGPMRFYYMLGWGVPAFITG 

LAVGLDPEGYGNPDFCWLSIYDTLIWSFAGP 

VAFAVSMSVFLYILAARASCAAQRQGFEKKG 

PVSGLQPSFAVLLLLSATWLLALLSVNSDTLL 

FHYLFATCNCIQGPFIFLSYWLSKEVRKALK 

LACSRKPSPDPALTTKSTLTSSYNCPSPYADG 

RLYQP\YGDSAGSLHSTSRSGKSQPSYIPFLLR 

EESALNPGXQGPPGLGGIPGR/LCFLGRFKDQQ 

H\DS*TRDFDSDLSLEDDQSGSYASTHSSDSEE 
EEEEEEEEAAFPGEQGWDSLLGPGAEKLPLHS 
TPKDGGPGPGKAPWPGDFGTTAKESSGNGAP 
EERLRENGDALSREGSLGPLPGSSAQPHKGIL 
KKKCLPT1SEKSSLLRLPLEQCTGSSRGSSASE 
GSRGGPPSRPPPRQSLQEQLNGVMP1AMSIKA 
GTVDEDSSGSEFLFFNFLH 


425 


1775 


A 


3429 


155 


1417 


"GEPAVQSCDCGCTQRSCPWLLVAPGLLS^bb 
RAASVREAEDAPLQPASIHPVSQGSRGPEGSL 
GSAECLPGDPLGARRATRAHSPVPGPPPSLPA 
AGTAVKRGLQPG* GA/GATSTPGTG AATGGL 
CGPAWAAPSAVGPCCCCPSISTTPSQMRSARP 
SLGCLPSWAS\PGTEHPPGPQGPGPS*DLCSV* 
KREFQRGPWAGMVILHRISAADPARAPGPDS 
NLQSALQQPATGCSEPAAVYSPPIGLWGA**P 
EYG*PQHSLPG*TAPADR*P\AGIKDRVYSNSI 
YELLENGQRAGTCVLEYATPLQTLFAMSQYS 
QAGFSREDRLEQAKLFCRTLEDILADAPESQN 
NCRLIAYQEPADDSSFSLSQEVLRHLRQEEKE 
EVTVGSLKTSAVPSTSTMSQEPELLISGMEKP 

LPLRTDFS 


426 


1776 


A 


3431 


1662 


369 


AI W WL S WLQHDLLPTPTQ V AIDFT ASN (iUFK 

SSQSLHCLSPRQPNHYLQALRAVGGICQDYD/ 

SVGESGAGGNRQGGLAQRIPQLFLLPSDKRFP 

AFGFGARIPPNFEVG*MRGKEGDGGRVSQAE 

KAGPHCSRLALTGXSHDFAINFDPENPECEGK 

RGDFHLPRLPADTLHTGAQTPLPRAQLPVPST 

HPRPVFINEISGVIASYRRCLPQIQLYGPTNVAP 

IINRVAEPAQREQSTGQATKYSVLLVLTDGV 

VSDMAETRTAIVRASRLPMSIIIVGVGNADFS 

DMRLLDGDDGPLRCPRGVPAARDIVQFVPFR 

DFKD V SPPGPFRLKDS S ASHPPKSDLRLP PFD 

VLLRTREP S WPP * SPTSPSDDPASPTLPLTPNHI 

TVPTLVAAPSALAKCVLAEVPRQWEYYASQ 

GISPGAPRPCTLATTPSPSP 


427 


1777 


A 


3446 


79


9748 



' " GCQSCWPA WPRLRRRGPASAGARLCiKKA^ w 
GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 
ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQ PPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAVAEEPLHRPKKELSATKKDRVNHCLTIC 

ENIVAQSVRNSPEFQKLLG1AMELFLLCSDDA 

ESDVRMVADECLNKVIKALMDSNLPRLQLEL 

YKE1KKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNF ANDNEIKVLLKAF1 ANLKS SSPTI 

RRTAAGSAVSICQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLLELGVLLTLRYLVPLLQQQV 

KDTSLKGSFGVTRJCEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNWTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AGGGSSCSPVLSRKQKGKVLLGEEEALEDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQTTTEGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLVHCVRLLSA5FLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 
TTEYPEEQYVSDILN V 1DHGDPQVRGA 1 A11X 

GTLICS1LSRSRFHVGDWMGTIRTLTGNTFSL 

ADCIPLLRKTLKDESSVTCKLACTAVRNCVM 

SLCSSSYSELGLQLODVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNWIHLLGDEDPRVRHVAAASL 

IRLVPKLFYKCDQGQADPVVAVARDQSSVYL 

KLLMHETQPP SHFS V STITRTYRG YNLLPS ITD 

VTMENNLSRVIAAVSHELITSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMDLTLLSSAWPLDLSAHQDAL 

ILAGNLLAASAPKSLRSSWASEEEANPAATK 

OEEVWPALGDRALVPMVEQLFSHLLKVINIC 

AHVLDDVAPGPADCAALPSLTNPPSLSPIRRK 

GKEKEPGEQASVPLSPKKGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDJGKCVEEILGyLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLGSSSVRPGLYHYCFMAPYTHFTQALADA 

SLRNMVQAEQENDTSGWFDVLQKVSTQLKT 

NLTSVTKNRADKNAIHNHIRLFEPLVIKALKQ 

YTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 

DSDQVFIGFVLKQFEYTEVGQFRESEAI1PNIFF 

FLVLLSYERYHSKQIIGIPKJIQLCDGIMASGR 

KAVTHAIPALQPIVHDLFVLRGTNKADAGtCE 

LETQKEVWSMLLRLIQYHQVLEMFILVLQQ 

CHKENEDKWKRLSRQIADIILPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLW1SGILAILRVLISQSTED 

IVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 

EHSEGKQIKNLPEETFSRFLLQLVGILLEDIVT 

KQLKVEMSEQQHTFYCQELGTLLMCLIHIFKS 

GMFRRTTAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMITTHPALVLLWCQILLLVNHTDYRWW 

AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWLIVNHIQDLISLSHEPPVQDFISAVHRNS 

AASGLFIQA1QSRCENLSTPTMLKKTLQCLEGI 

HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 

ACRRVEMLLAANLQSSMAQLPMEELNRIQEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQCWTRSDSALLEGAELVNRIPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LVWSKLPSHLHLPPEKEKDIVKPWATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WSWSSTEFVTHACSLIYCVHFILEAVAVQPG 

EQLLSPERRTNTPKAI SEEEEE VDPNTQNPK Yl 

TAACEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNrilSLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGIVEQE1QAMVSKRENIATHHLYQAWD 

P VPSLSPATTGALI SHEKLLLQINPERELGSMS 

YKLGQVS1HSVWLGNSITPLREEEWDEEEEEE 
ADAPAPSSPPTSPVNSRKHRAGVDIHSCbQhJL 

LEL YSRWILPSS S ARRTP AILISE WRSLL WS 

DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RSSHLPSRVGALHGILYVLECDLLDDTAKQLI 

PVI SD YLLSNLKGI AHCVN1HSQQHVLVMC AT 

AFYLIENYPLDVGPEFSASIIQMCGVMLSGSE 

ESTPSIIYHCALRGLERLLLSEQLSRLDAESLV 

K1SVDRVNVHSPHRAN1AALGIJVILTCMYTG 

KEKVSPGRTSDPNPAAPDSESVTVAMERVSVL 

FDR1RKGFPCEARWARILPQFLDDFFPPQDIM 

r^VIGEFLSNQQPYPQFMATVVYKVFQTLHS 

TGQSSMVRDWVMLSLSNFTQRAPVAMATWS 

LSCFFVSASTSPWVAAILPHV1SRMGKLEQVD 

WLFCLVATDFYRHQ1EEELDRRAFQSVLEV 

VAAPGSPYHRLLTCLRNVHKVTTC 


428 


1778 


A


3449 


3 


430 


NSRPSPSAALVEVLLRSGSTFPHTVSGGWAA 
WGPWSSCSRDCELGFRVRKRTCTNPEPRNGG 
LPCVG D AAE YQDCNPQACPVRGA WSC WTS 
WSPCSASCGGGHYQRTRSCTSPAPSPGEDICL 
GLHTEEALCATQACPEGWS 


429 


1779 


A 


3464 


583 


3 


DALDRRYLERCHPAAGGWVGEGE*ALCQKT/ 

RFSGVLEPPLPSLKDGGRFPAWT*RSCSKSLR 

AAFTSQFFPSRRSRASPGSAP\GNGQNLTEQHP 

CPGSCDPQVLSASWM*VEHRSKFRPPP*NSTI 

PPES/RS* QGGTVQTGQHSSGREAGS WRARGR 

NAGRR*KGGG1CIGTKQGAVRARKECRGEMA 

SGETDSE 


430 



1780 


A 



3473 


2802 


270 



FRMR1FLHCPWNQQMWKIWNLLETSLESCKA 

HLSIQKLLKER\Q\QLPVFKHRDSIVETLKRHR 

VVWAGE1AGSGKSTQVPHFLLEDLLLNEWE 

ASKCNIVCTQPRRISAVSLANRVCDELGCENG 

PGGRNSLCGYQIRMESRACESTRLLYCTTGV 

LLRKXQEDGIXSNVS/HMFIVDEV\HER\SVQS 

DFLLIILKE1LQKRSDLHLILMSATVDSEKFST 

YFTHCPILRISGRSYPVEVFHLEDIIEETGFVLE 

KDSEYCQKFLEEEEEVTINVTSKAGGIKKYQE 

YIPVQTGAHADLNPFYQKYSSRTQHAILYMN 

PHKJNLDLILELLAYLDKSPQFRNIEGAVLIFL 

PGLAHIQQLYDLLSNDRRFYSERYKVIALHSI 

LSTQDQAAACTLPPPGVRKIVLATNIAETGm 

PD WF VIDTGRTKENK YHES S QMSSL VETFVS 

KASALQRQGRAGRVRDGFCFRMYTRERFEG 

FMDYSVPEILRVPLEELCLHIMKCNLGSPEDF 

LSKALDPPQLQVISNAMNLLRKIGACELNEPK 

LTPLGQHLAALPVNVKIGKMLIFGAIFGCLDP 

VATLAAVMTEKSPFTTPIGRKDEADLAKSAL 

AMADSDHLTrYNAYLGWKKARQEGGYRSEI 

TYCRRNFLNRTSLLTLtDVK^r^u^ 

SSSTTSTSWEGNRASQTLSFQEIALLKAVLVA 

G L Y DNVGKn YTKS VD VTEKL ACIVETAQGK. 

AQVHPSSVNRDLQTHGWLLYQEKIRYARVY 

LRETTLITPFPVLLFGGDIEVQHRERLLSIDGW 

IYFQAPVKIAVIFKQLRVLIDSVLRKKLENPK 

MSLENDKILQITTELIKTENN 


431 


1781 


A 


3474 


1 


441 


FRP APGH VQP* GGS S AAAGGGLLSHPRPCQQ 

PCPPAPAPSRPRSLGSLGQRVPAALATAAQEL 

PATLGGDGGKPALTAGEAALPGLHRSGVPAA 

AARC*PCT/SRPT*STLSPTQAAWWCRPSRRQ 

QRGEASTGGASGRRCGSCFQV 






WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


432 


1782 


A 


3478 


416 


23 


QLRRLTLPNFKT Y/Y SS * I1E1A W H * * KNMQED 
Q WFRRESPEIDLCKY S * LSFDKEAKAIK/WKE 
CSLFNKWC/YKNWM/LHVQKKRI * VQTLHPS 
QKLKXSKWKDLNVECRITKIXDQEYPGDLGY 
SRALNSGSR 


433 


1783 


A 


3504 


1876 


552 


CLAPCSPQPEKNGMQPLLLLLPPLLYQQLLH5 

SLGAPGESTLLVRTSKLLVGLGLQLLVWLLL 

QTRSLLALQLHLTSSAPLLAAPTAVCSCSRCS 

APRSRCVARPAARTGLPTPAPASSPAP AASPA 

PAASPAPAESTA\PQPLILLPKP/PPAPGAPPPRP 

GAPPPRPAASPSPAASPAPPAASPVLTASPPLP 

AASPSPAASPAPPAASPVLTASPPLPAASPSPA 

. . —-^ • icmn ta cDUI DA A CPA T A ACPVTTT 

ASPAPPAASPVL I AirrLrAAorALAAir v n i 

ASPPVHVASPPVHTASPPVHVASPPVHTASPP 

VHVASPPVHTASPHVHVASPPVHVASPPVHV 

ASPPVHTASPPVHVASPPVHTASPHVHVASPP 

VHTASPPVHVASPPVHVASPPVHVAYPPVHV 

ASPPVHVASPPVH V Aor r V bCoOL'o l ou\^r rr 

OPGAVFPHSLAPSLGGWSHLVAALP 


434 


1784 


A 


3516 


142 


590 


GGVNRPRSETEQVKTPVLISSWDYRHPPPKl^A 
SFFVFLV^TGRTALARMVLISWPCDLPTSASQ 
SAGITGVRHHA\RLLYFEQESHSVTQAGW\VQ 
WHNLGSLQPLSLEDRLSPGVLGCSALCRSGV 
RTKFGINMVTSRERGTTRLPKEG 


435 


1785 


A 


3529 


1 


3161 


MSLVRAALEALDELDLFGVKGGPQSV1HVLA 

DEV QHCQS ILN SLLPRA STSKEVD ASLLS WS 

FPAFAVEDSQLVELTKQEimCLQGRYGCCRF 

LRDGYKTPKEDPNRLYY/ENPAELKLFENIEC 

EWPLFWTYFILDGVFSGNAEQVQEYKEALEA 

VLKGKNGVPLLPELYSVPPDRVDEEYQNPHT 

VDRVPMGKLPHMWGQSLYILGSLMAEGFLA 

PGEIDPLNRRFSTVPKPDVWQVYPSLPHGCS 

SKSPSHQCTnSIRTTRKTTAPVSILAETEEIKTIL 

KDKGI YVETIAEV YPIRVQPARILSHI YS SLEIF 

LPFLNSVSGCNNRMKLSGRPYRHMGVLGTSK 

LYDIRKTIFTFTPQFIDQQQFYLALDNKMIVE 

MLRTDLSYLCSRWRMTGQPTITFPISHSMLDE 

DGTSLNSSILAALRKMQDGYFGGARVQTGKL 

SEFLTTSCCTHLSFMDPGPEGKLYSEDYDDN 

YDYLESGNVmNDYDSTSHARCGDEVARYL 

DHLLAHTAPHPKLAPTSQKGGLDRFQAAVQT 

TCDLMSLVTKAKELHVQNVHMYLPTKLFQA 

SRPSFNLLDSPHPRQENQVPSVRVEIHLPRDQ 

SGEVDFKALVLQLKETSSLQEQADILYMLYT 

MKGPDWNTELYNERSATVRELLTELYGKVG 

EIRHWGLIRY1SGILRKXVEALDEACTDLLSH 

QKHLTVGLPPEPREKTISAPLPYEALTQLIDEA 

SEGDMSISILTQEIMVYLAMYMRTQPGLFAE 

MFRLRIGLilQVMATELAHSLRCSAEEATEGL 

MNLSPSAMKNLLHHILSGKEFGVERSVRPTD 

SNVSPAISIHE1GAVGATKTERTGIMQL1CSEIK 

r\ c dt.t<; \ATP<z 9 A YDOOSSKD SRQGQ W 

QRRRRLDGALNRVPVGFYQKVWKVLQKCH 

GLSVEGFVLPSSTTRENfTPGEIKFSVHVESVL 

NR\TQPEYRQLLVEAIL\VLTMLADIEI\HSIGS 

IIAV^KIVHLANDLFLQEQKTLGADDTMLAKD 

PASGICTLLYDSAPSGRFGTN1TYLSKAAATY 

VQEFLPHSICAMQ 


436 


1786


A


3546 


73 


393 


" CP* LTWELLEVKKAEVLQDSLDGRYSTPSSCL 
EQPDSCRPYGRSFYALEEKHVIFSLDVGETDN 
KGKGKTIRGI* TFKGRKGGT Y QREHDANPLA
PXSARSCWMRKG 


437


1787 


A 


3554 


5157 


2939 


AVRAEPGLEELSSGLRAHSPSATTVCEPEAQO 

SASGCRYAAHPHWGLGGAAAAGGSWEPQPP 

RPVCEPAGRGKPHPPAAPRSPLLPGSRRRPHA 

AQPGARARTSPPPASARNMAARPAATLAWSL 

LLLSSALLREGCRARFVAERDSEDDGEEPWF 

PESPLQSPTVLVAVLARNAAHTLPHFLGCLER 

LDYPKSRMAIWAATDHNVDNTTEIFREWLK 

NVQRLYHYVEWRPMDEPESYPDE1GPKHWP 

TSRPAHVMKLRQAALRTAREKWSDY1LFIDV 

DNFLTNTQTLNLL1AENKTP/APMLESRGLYS 

NFWCGITPKGFYKRTPDYXVQIREWKRTGCFP 

VPMVH STFLIDLRKEASDKLTF YPPHQD YTW 

TFDDIIVFAFSSRQAGIQMYLCKREHYGYLPIP 

LKPHQTLQEDIENLIHVQIEAMIDRPPMEPSQ 

YV S VWKYPDKMGFDEIFMINLKRRKG QGGD 

RWLRTLYEQEIEVKJVEAVDGKALKTSQLKA 

LN1EMLPGYRDPYSSRPLTRGEIGCFLSHYSV 

WKEVn)RELEKTLVIEDDVRFEHQFKJOU,MK 

LMDNIDQAQLDWELIYIGRKRMQVKEPEKA 

VPNVANLVEADYSYWTLGYVISLEGAQKLV 

G ANPFGKMLP VDEFLP VMYNKHP V AE YKE Y 

YESRDLKAFSAEPLLIYPTHYTGQPGYLSDTE 

rmrriwmt-KTCTM a rnu/nn th A WK RRKOSRIYSN 

AKNTEALPPPTSLDTVPSRDEL 


438 


1788 


A 


3563 


130 


527 


IFFN S SSLFCRVFCLFLRWSFTL V AQ AR V Q * L 
NLSSLQPLPPGFK*FSCLSPPRS*DYRRPPPRPA 
NFLYF* ♦RQGFTVLGQAGLELLT/S/GDPPTSA 
SQSAGITGVSHRAWPVHA1STHISLVKTRPSLT 


439 


1789 


A 


3565 


446 


1834 


" LLQPAMRKSPGLSDCLWAW1LLLS1 LI ORS V 
GOPSLQDELKDNTTVFTRILDRLLDGYDNRL 
RPGLGERVTEVKTDIFVTSFGPVSDHDMEYTI 
DWFKQSWKDERLKFKGPMTVLRLNNLMAS 
KIWTPDTFFHNGKKSVAHNMTMPNKLLRITE 
DGTLLYTMRLTVR\AECPMAFGRDr^MU)\AH 
ACTLKFGSYAYTRAEVVYEWTREPARSVVV 
AEDGSRLNQYDLLGQTVDSGIVQSSTGEYW 
MTTHFHLKRKiO Yr Vl^J l i LrL,uvi i v u-ov v ji 
WLNRESWARTVTGVTTVLTMTTLSISARNSL 
PKVA YATAMD V/FIAVCYAF VFS ALIEFAT VN 
YTTKRGYAWDGKSVWEKPKlCvTCDPLIKKN 
vt^adtatcvtpmt APGHPGI ATIAKSATIEP 
KEVKPETKPPEPKKTFNSVSKIDRLSR1AFPLL 
FGIFNLVYWATYLNREPQLKAPTPHQ 


440 


1790 


A 


3568 


1 


350 


' " STSSCFPAAAAAIMREI VHLQAGQCGNQlGAK 
FWEVISDEHGIDPTGTYHGDSDLQLERINVYY 
NEATGE AP VPSPTALRGPRGPCLG* RPP VPAG 
GKYVPRAVLVDMEPGTMDSV 


441 


1791 


A 


3569 


2 


1751 



" FVAVAGAVSGEPLVHWCTQQLRKTFULUV5 
EEDQYVLSIESAEEIREYVrDLLQGNEGKKGQ 
FIEEL1TKWQKNDQELISDPLQQCFKKDEILDG 
QKSGDHLKRGRKKGRNRQEVPAFTEPDTTAE 
VKTPFDLAKAQENSNSVKKKTKFWLYTREG 
QDRLA\XLPGRHPCDCLGQKHKLINNCLICG 
RIVCEQEGSGPCLFCGTLVCTHEEQDILRGDS 
N^CSQKLLKKLMSGVENSGKVDISTKDLLPH 
QELRIKSGLEKAIKHKDKLLEFDRTSIRRTQVI 
SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)














DDESDYFASDSNQWLSKLERETLQKREEELK 

ELRHASRLSKXVTIDFAGRKILEEENSLAEYH 

SRLDETIQAIANGTLNQPLTKLDRSSEEPLGVL 

VNPNMYQSPTOWVDHTGAASQKKAFRSSGF 

GLEFNSFQHQLRIQDQEFQEGFDGGWCLSVH 

QPWASLLVRGIKRVEGRSWYTPHRGRLWIAA 

TAKKPSPQEVSELQATYRLLRGKDVEFPKDY 

PSGCLLGCVDLIDCLSQKQFKEQFPDISQESDS 

PFWICKNPQEMVVKFPDCGNPKIWKLDSKIH 

OGAKKGLMKQNKAV 


442 


1792 


A 


3576 


1 


2019 


MPRSHTGERLCEGKEGSQCAENFSPNLS V IK 

KTAGVKPYECTICGKAFMRLSSLTRHMRSHT 

AIRAI\EKPYKCKJEC\GRAFSLSQILSK\HERSH 

TG EKP Y KCKQCGKTFI YHQPFQRHERTHIGEK 

PYECKQCGKALSCSSSLRVHERIHTGEKPYEC 

KQCGKAFSCS S S [RVHERTHTGEKP Y ACKXEC 

GKAFISXTTSVLTHMITHNGDRPYKCKECXjKA 

FIFPSFLRVHERIHTGEKPYKCKQCGKAFRWS 

TSIQIHERIHTGEKPYKCKECGKSFSARP AFRV 

HVRVHTGEKPYKCKECGKAFSRISYFRIHERT 

HTGEKPYECKKCGKTFNYPLDLKIHKRNHTG 

EKPYECKECAKTFISLENFRRHMITHTGDGPY 

KCRDCGKVFIEPSALRTHERTHTGEKPYECKQ 

CGKAFSCSSYIRIHKRTHTGEK\PYECKECGK 

AFIYPTSFQGHMRMHTGEKPYKCKECGKAFS 

LHSSFRNRFTTRIHNYEKPLEC* QVCGKAFS VSTS 

LKXPMRNAQSDRKLY/KCEK*EKVFNSNRCF 

QS CENSH* REKSCQCK* YRKRDTR* FMYSQ V 

PHNHVSVSNGPYRyCGSPIRLYNT*NISrNRNL 

VAVVTP*CSTLFKCLWCWCKRAALSVV*/IVQ 

DSGRGRWLTPVIPALWEAKAGGSRGQEIKTIL 

ANTVKPHLY 


443 


1793 


A 


3578 


287 


114 


DF YERKFEQFIEGHKQIVNKWRDLLCS WKKA 
LSHKKSVLQNNL*FSAASMRFQKVFF 


444


1794 


A 


3582 


3335 


1909 


■ ^FFSLJLAAMAMTGSrPCSSMSNH'rKbKVl 
MTKVTLENFYSNLIAQHEEREMRQKKLEKV 
MEEEGLKDEEKRLRRSAHARKETEFLRLKRT 
RL GLED FESLK VIGRG AFGE VRL VQKKDTGH 
VYAMKDLRKADMLEKEQVGHIRAERDILVEA 
DSLWVVKMFYSFQDKLNLYLIMEFLPGGDM 
MTLLMKKDTLTEEETQFY1AETVLAIDSIHQL 
GFIHRDIKPDNLLLDSKGHVKLSDFGLCTGLK 
KAHRTEFYRNLNHSLPSDITFQNMNSKRKAE 
TWKRNRRQLAFSTVGTPDYIAPEVFMQTGYN 
j KXCDWWSLGV1MYEMLIGYPPFCSETPQETY 
KXV^MNWKETLTFPPEVPISEKAKDLILRFCCE 
! WEHR1GAPGVEEIKSNSFFEGVDWEHIRERPA 
1 AISIEDCSIDDTSNFDEFPESDILKPTVATSNHPE 
; TDYKNKDVV^INYTYKRFEGLTARGAIPSYM 
! KAAK 


445


1795 


A 


3584 


1


6169


! RTRGIEKRFAYSFLQQLiRYVDEAHQYULbFU 
* GGSRGKGEHFPVTQEKFFAKVVLPLIDQYFK 
: NHRLYFLSAASRPLCSGGHASNKEKEMVTSL 
rri /i pin \ rp xto t c T FOMD A TSTVNCLH1LGOT 
LDARTVMKTGLESVKSALRAFLDNAAEDLE 
KTMENLKQGQFTHTRNQPKGVTQIINYTTVA 
LLPMLSSLFEHIGQHQFGEDLILEDVQVSCYR1 
LTSLYALGTSKSIYVERQRSALGECLAAFAGA 
FPVAJFLETHLDKHNTYSIYNTKSSRERAALSLP 
TNVEDVCPNIPSLEKLMEEIVELAESGIRYTQ 



186 



WO 01/57188 



PCT/US01/03800

















MPHVMEVILPMLCSYMSRWWEHGPENN^bK 

AEMCCTALNSEHMNTLLGNILKIIYNNLGIDE 

GAWMKJR1AVFSQPIINKVKPQLLKTHPLPLM 

EKLKKKAATVVSEEDHLKAEARGDMSEAEL 

L1LDEFTTLARDLYAFYPLLIRFGDYNRAKWL 

K^NPEAEELFRMVAEVFIYWSKSHNFKREE 

QNFWQNEINNMSFLITDTKSKMSKAAVSDQ 

ERKKMKRKGDRYSMQTSLIVAALKRLLPIGL 

NI C APGDQELIAL AKNRFSLKDTEDE VRDIIRS 

NIHLQGKLEDPAIRWQMA1YKDLPNRTDDTS 

DPEKTVERVLD1ANVLFHLEQKSKRVGRRHY 

CLVEHPQRSKKAVWHKLLSKQRKRAWACF 

RMAPLYKLPRHRAVNLFLQGYEKSWIETEEH 

YFEDKLIEDLAKPGAEPPEEDEGTKRVDPLHQ 

LILLFSRTALTEKCKLEEDFLYMAYADIMAKS 

CHDEEDDDGEEEVKSFEEKEMEKQKLLYQQ 

ARLHDRGAAEMVLQTISASKGETGPMVAAT 

LKLGIAILNGGNSTVQQKMLDYLKEKKDVGF 

FQSLAGLMQSCSVLDLNAFERQNKAEGLGM 

VTEEGSGEKVLQDDEFTCDLFRFLQLLCEGH 

NSDFQNYLRTQTGNNTTVNIILSTVDYLLRVQ 

ESISDFYWYYSGKDVIDEQGQRNFSKAIQVA 

KQVFNTLTEYIQGPCTGNQQSLAHSRLWDAV 

VGFLHVFAHMQMKLSQDSSQIELLKELMDLQ 

KJDMVVMLLSMLEGNVVNGTIGKQMVDMLV 

ESSNNVEM1LKFFDMFLKLKJDLTSSDTFKEYD 

PDGKGVIFKRDFHKAMESHKHYTQSETEFLL 

SCAETDEKETLDYEEFVKRFHEPAKDIGFNVA 

VLLTNLSEHMPNDTRLQTFLELAESVLNYEQP 

FLGRIEIMGSAKRIERVYFEISESSRTQWEKPQ 

VKESKRQFIFDWNEGGEKEKMELFVNFCED 

TIFEMQLAAQ1SESDLNERSANKEESEKERPEE 

QGPRMAFFSILTVRSALFALRYNILTLMRMLS 

LK^LKKQMKKVKKMTVKDMVTAFFSSYWSI 

FMTLLHFVASVFRGFFRIICSLLLGGSLVEGA 

KKIKVAELLANMPDPTQDEVRGDGEEGERKP 

LEAALPSEDLTDLKELTEESDLLSDIFGLDLKR 

EGGQYKLIPHNPNAGLSDLMSNPVPMPEVQE 

KFQEQKAKEEEKEEKJEETKSEPEKAEGEDGE 

KEEKAKEDKGKQKLRQLHTHRYGEPEVPESA 

FV^QIAYQQKLLNYFARNFYNMRMLALFV 

AFAINFILLFYKVSTSSVVEGKELPTRSSSENA 

KVTSLDSSSHRIIAVrTYVLEESSGYMEPTVRlL 

PILHTVISFFCI1GYYCLKVPLVIFKREKEVARK 

LEFDGLYITEQPSEDDDCGQWDRLVTNTQSFP 

NNYWDKFVKRKVMDKYGEFYGRDRISELLG 

MDKAALDFSDAREKJCKPKKDSSLSAVLNSID 

VKYQMWKLGVVrTDNSFLYLAWYMTMSVL 

GHYXKNFFFAAHLLDIAMGFKTLRTTLSSVTH 

NGKQLVLTVGLLAVVVYLYTWAFNFFRKF 

YNKSEDGDTPDMKCDDMLTCYMFHMYVGV 

RAGGGIGDEIEDPAGDEYEIYRnFDITFFFFVI 

V1LLAIIQGLIIDAFGELRDQQEQVKEDMETKC 

Firr,TrTNmYFDTVPHGFETrVnvOEHNLANYLF 

FLMYLrNKDETEHTGQESYVWKMYQERCWE 

FFPAGDCFRXQYEDQLN 


446 


1796 


A 


3592 


1 


355 


" AGLELLNSDDPPALASQSAGITGVTR 1 FbLhh * 
DTVLLCCSGWSAVAPSRLTAALFS*AQAVCL 
SLPRSWDYRRW>TPHPANFCIFCRDE^LA/ML 
PRLVSNS^TQAILLPRPPKMLGLQV 


447 


1797 


A 


3598 


1202 


1070 


LFVGGGPlCPEGASGFAPGPAPAPRVGVDAtv 

GR*V*GAAASQGA/GSLRPRPTGPGHPGAWL 

QVWGAAAVCAGPAM*/AVRAKRGPRAG*EP 

NSPWSGVLAA\RAVGAGPWP*P*PGCS*ARG 

PSSRSAPGLASGPAAPLLQGVHSSAGPLLCYI 

MGTLALGLKP* * AWGWGEWRPKG 


448 


1798 


A 


3604 


3115 


557 


FRRKGGGGPKDFGAGLKYNSRHEKVNGLbb 

GVEFLPVNNVKKX^KHGPGRWWLAAVLIG 

LLLVLLGIGFLVWHLQYRDVRVQKVFNGYM 

RITNENFVDAYENSNSTEFVSLASKVKDALICL 

LYSGVPFLGPYHKESAVTAFSEGSV1AYYWSE 

FSIPQHLVEEAERVMAEERWMLPPRARSLKS 

FWTSWAFPTDSKTVQRTQDNSCSFGLHAR 

GVELMRPTTPGFPDSPYPAHARCQWALRGD 

ADSVLSLTFRSFDLASCDERGRHLV\TVYNTVL 

SPMEPHA\LVQLCGTYPPSYNLTFHS\S\QNVL 

LITLITNTERRHPG\FEATFFQLPRMSSCGGRL 

RKAQGTFNSPYYPGHYPPNIDCTWNIEVPNN 

QHVKVRFKFFYLLEPGVPAGTCPKDYVE1NG 

EKYCGERSQFWTSN SNK1TVRFHSDQS YTDT 

GFLAEYLSYDSSDPCPGQFTCRTGRCIRKELR 

CDGWADCTDHSDELNCSCDAGHQFTCKNKF 

CKPLFWVCDSLNDCGDNSDEQGCSCPVAQTF 

RCSNGKCLSKSQQCNGKDDCGDGSDEASCP 

KVNVXO-CTKHTYRCLNGLCLSKGNPECDGK 

EDCSDGSDEKDCDCGLRSFTRQARWGGTD 

ADEGEWPWQVSLHALGQGHICGASL1SPNWL 

VSAAHCYIDDRGFRYSDPTQWTAFLGLHDQS 

QRSAPGVQERRLKRIISHPFFNDFTFDYDIALL 

ELEKPAEYSSMVRP1CLPDASHVFPAGKAIWV 

TGWGHTQYGGTGALILQKGE1RVINQTTCEN 

LLPQQITPRMMCVGFLSGGVDSCQGDSGGPL 

SSVEADGRIFQAGWSWGDGCAQRNKPGVY 

TRLPLFRDWIKENTGV 


449 


1799 


A 


3618 


2 


613 


■ — pvSGSPWRMDGSTERLEAKJ&PAGRLPW SSK^ 
EMTRRPSLMAGRQHGWSAQQSATVANPVPG 
ANPDLLPHFLGEPEDVYIVKNKPVLLVCKAV 
PATQIFFICCNGEWVRQVDHVIERSTDGSSGLP 
TMEVRINVSRQQVEKVFGLEEYWCQCVAWS 
SSGTTKSQKAYIR1AYLRKNFEQEPLAKEVSL 
EOGIVLPCRPPEGIPPAE 


450 


1800 


A 


3620 


1 


2676 


MEPSLGQGMDLTCPFG VSPACGAQAb W bibu 

ADAAEVPGTRGHSQQEAAMPHIPEDEEPPGE 

PQAAQSPAGQQGPPTAGVSCSPTPTIVLTGDA 

TSPEGETDKNLANRVHSPHKRLSHRHLKVST 

ASLTSVDPAGH1TDLVNDQLPDISISEEDKKKN 

LALLEEAKLVSERFLTRRGRKSRSSPGDSPSA 

VSPNLSPSASPTSSRSNSLTVPTPPEGDEADVS 

SPHPGEPNVPKGLADRKQNDQRKVSQGRLAP 

RPPP\TKSKE1A1EQKENFDPLQYPETTPKGLA 

PVTNSSGKMALNSPQPGPVESELGKQLLKTG 

WEGSPLPRSPTQDAAGVGPPASQGRGPAGEP 

MGPE AGSKAELPPTV SRPPLLRGLS WDSGPEE 

PGPRLQKVLAKLPLAEEEKRFAGKAGGKLAK 

APGLKDFQIQVQPVRMQKLTKLREEHILMRN 

QNLVGLKLPDLSEAAEQEKGLPSELSPAIEEE 

ESKSGLDVMPNISDVLLRKLRVHRSLPGSAPP 

LTEKEVENVFVQLSSAFRNDSYTLESRINQAE 

RERNLTEENTEKELENFKASITSSASLWHHCE 

HRETYQKLLED1AVLHRLAARLSSRAEWGA 














VRQEKRM SKATE VMMQY VENLKRTYEKDH 
AELMEFKKLANQNSSRSCGPSEDGVLRTARS 
MSLTLGKKMPRRRVSVAVVPKFNALNLPGQ 

^^-o.^nr n ii PCPD\inV(^?I PYTT^ A 1 PAT LE 

TPSSSSIPSIJ^ALSESrNOxu^JLr v 1 saltai,^ 

NGKTNGDPDCEASAPALTLSCLEELSQETKA 

RMEEEAYSKGFQEGLKKTKELQDLKEEEEEQ 

KSESPEEPEEVEETEEEEKDPRSSKLEELVHFL 

QVMYTKLCQHWQ V I WMMAA v ivxl yli wl 

GLYNSYNSCAEQADGPLGRSTCSAAQKDSW 

WSSGLQHEQPTEQ 


451 


1801 


A 


3623 


504 


198 


QUQHQTVHTGRKXYECKECGKAFNQUS 1 \A 

RHQRIHTGEKPYECKVCGKAFRVSSQLKQHQ 

RJJTTGERPYQCKELKGRGAEMLAVLAVKEQ 

NRTPVNYGK 


452 


1802 


— 


3628 


2 


195 


l^fcLHSAKAFHY*SSCS^SCEEGFALIGPEV V 
OCTALGVWTAPAPVCIAVQCQHLEALNEGT 
MG*DYPFTAFAYGSSCKYECHTVYRVRGLD 
MLH SRGC YLWNGHFTT* EAI SCEPLERPCH* S 
V^CSFSCEEGFALIGPEWQCTALGVWTAPAP 
VCIAVOCQHLEALNEGTMG 


453 


1803 


A 


3637 


662 


142 


1QAKGLGIWHVPNKSPMQHWR\KGSLLRYK1 

DTGFLQTLGHNLLGIYQKYPVKYGEGKCWT 

DNGPVIPVVYDFGDAQKTASYYSPYGQREFT 

AGFVQFRVFNNERAANALCAGMRV 1 OLM 1 1 

HHCIGGGGYFPEASPQQCGDFSGFDWSGYGT 

\HVGYSSSREITE\AAVLLFYR 


454 


1804 


A 


3641 


1 


362 


TQVHPAMLGLDELGRSGCGHC 1 QADLRFGD 
AAGRDPGQDNDRNTAEPAFPPPPRVMAAAA 

ALRAPAQ SS VTFEDV AVNFSLEb W bLLN 
GCLYHDVMLETLTLISSLGKVLILNCDLS 


455 


1805 


A 


3646 


2 


414 


" AAAGRGASGALTGEGGGEQGRRVGLUSKAM 
SLLLGPTFNS CQ VSSQPPRVAGLGLPLKHEPS 
RPOPPSPRGPRTVRAGVPGAHPQDTPCPEFVTl 
PRKVPLVGEAPGLPPEERbRG WKKD i ruwc 
SRVRAPSYDDIT 


456 


1806 


A 


3656 


396 


8 


" ' QIVSWSYLTLYTKNNLKSMKDLNVN itJVuK 
LLELKMHNLG*AKTFLNnQKALIKRKILIHW 
P/LIKIK/SFCSLSDTIKKMKRQTIVV/EQTFIIHI 
SVKELVSRTfEAFLQFWCTVNRPVFDIKKEQK 

F 


457 


1807 


A 


3660 


14 


1961 


" SEAKLGGPTGMDLWQLLLTLALAGSbUAT^U 
SEATAAILSRAPWSLQSVNPGLKTNSSKEPKF 
TKCRSPERETFSCHWTDEVHHGTKNLGPIQLF 
YTRROTQEWTQEWKECPDYVSAGENSCYFN 
SSFTSIWYCIKLTSNGGTVDEKCFSVDEIVQ 
PDPPLALNWTLLNVSLTGIHADIQVRWEAPRN 
ADIQKGWM^EYELQYKEVNETKWKMMDP 
ILTTSVPVYSLKVDKEYEVRVRSKQRNSGNY 
GEFSEVLYVTLPQMSQFTCEEDFYFPWLLIIIF 
GIFGLTVMLFVFLFSKQQR1KMLILPPVPVPKI 
KGIDPDLLKEGKLEEVNTILA1HDSYKPEFHS 
DDSWVEFIELDIDEPDEKTEESDTDRLLSSDH 
EKLHINLGVKDGDSGRTSCCEPDILETDFNAH 
D1HEGTSEVAQPQRLKGEADLLCLDQKNQNN 
SPYHDACPATQQPSV1QAEKNKPQPLPTEGAE 
STHQAAHIQLSNPSSLSNIDFYAQVSDITPAGS 
WLSPGQKNKAGMSQCDMHPEMVSLCQENF 
LMDNAYFCEADAKKCIPVAPHIKVESHIQFVS 
LNQEDIYllTESLnTAAGSrAGTGEHVPGSEM 














"PVPDYTSIHTVQSPQGLILNATALPLPDKEFLS 
SCGYVSTDQLNKIMP 


458 


1808 


A 


3663 


154 


462 


TRAPASGRSGAGLALSANAPDSGGHPGATEG 
PAGSLAHASGSARGTWRVRGRGSHGWERTV 
GAGGCANPVPALHSCASAPRGTGRVSALGPK 
TGSSPLSSPKG 


459 


1809 


A 


3664 


902 


135 


LGKYNTSMALFDFVLHNSTGEIRY1TEDDV1Q 

SQNALGKYNTSMALFESNSFEKTILESPYYVD 

LNQTLFVQVSLHTSDPNLVVFLDTCRASPTSD 

FASPTYDL1KSGCSRDETCKWYPLFGHYGRF 

QFNAFKFLRSMSSVYLQCKVLICDSSDHQSRC 

VNQGCVSRSKRDISSYKWKTDSHGPIRLKRDR 

SA\NGNSGFQHETHAEETPNQPFNSVHLFSFM 

VLALNVVTVATITVRHFVNQRADYQXYQKLQ 

NY 


460 


1810 


A 


3670 


850 


557 


LGILMSPQVEAGEI*ALL"TPPPGCMQhSPLll7r 
K* V/VSPGLTP/PPPEVPS VFLVEPGLPHAGQA 
GLDLL\TSGDPPASTSQSARTTDVSHRAQPLAI 

S 


461 


1811


A 


3671 


2472 


2099 


IGVLAFETGSCSVTRLYCIG1IMPHCSLDLAGS\ 
TSAFRIAGTTSVHHHPQLTFFFFWIETGSHCV 
VQTGL*LLALSNPPALASQ1AGISGMSHRAWP 
GLVLYSLEFSLLCASQSLIMLFTCYNE 


462 


1812 


A 


3672 


394 


110 


VKPVNGESKRD* GADTQTCEGEADEQLQT\N 
CYYD/STKSFFY1SCG*K\RKPTWAENRRLNA 
KMFG1PLHSNSDPWGYEEREVIGFHRSRVSRG 


463 


1813 


A 


3673 


348 


1 


QRNPFSAGHPQRPPTSGSQSELUVQPRLRPGR 
KSSFSRDQDVW* SQAVPKRQ*QRNPFS AGHP 
QRPPTSGSQSELLAQPRLRPGRKSSFSRDQDV 
WPGQKPRPSQQQHQMCASPTLGQRSPFALEP 
VPAYHGGRDPFASARPSPVGIPKPRAAPAGG 
GWRRIRPKSSTK 


464 


1814 


A 


3676 


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPbR 

KVFQLLPSFPTLTRSKSHESQLGNRIDDVSSM 

RFDLSHGSPQMVRRDIGLSVTHRFSTKSWLS 

QVCHVCQKSMIFGVKCKHCRIJCCHNKCTKE 

APACRISFLPLTRLRRTESVPSDINNPVDRAAE 

PHFGTLPKALTKKEHPPAMNHLDSSSNPSSTT 

FSTPSSPAPFPTSSNPSSATTPP\NPSP\GQR\DSR 

FNFPSC/AYFIHHR\Q\QFIFPDISAFAHAAPLPE 

AADGTRLDDQPKADVLEAHEAEAEEPEAGK 

SEAEDDEDEVDDLPSSRRPWRGPISRKASQTS 

VYLQEWDIPFEQVELGEPIGQGRWGRVHRGR 

WHGEVAIRLLEMDGHNQDHLKLFKKEVMN 

YRQTRHENVVLFMGACMNPPHLAIITSFCKG 

RTLHSFVRDPKTSLDINKTRQIAQEIIKGMGY 

LHAKGIVFIK1)LKSRNVFYDNG\KV VI J UfOLf 

\GISGWPVEGRRENQLKXSHDWLCYLAPEIVR 

EMTPGKJDEDQl-PFSKAADVTAFGTVWYELQ 

ARDWPLKNQAAEASIWQIGSGEGMKRVLTS 

VSLGKEVSENLSACWAFDLQERPS\FSLLMD 

MLEIO_PKLNRRLSHPGIiF*KSADINSSKVVPR 

FERFGLGVXESSNPKM 


465 


1815 


A 


3679 


8 


803 


' IPSPAWWNSTWADTFSLLLALAVALYLGYY 
WACVLQTHRAFCASNTEDLETVV>miKHR 
QAPLLAVGISFGGILVXNHLAQARQAAGLVA 
ALTLSACWDSFETTRSLETPLNSLLFNQPLTA 
GLCQLVERLSY/E*DLQARTIRQFDERYTSVA 
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FG YQDC VTYYKAASPRTKiD AIRIP VL Y Lb AA 
DDPFSTVCALPKQAA QHbr i v j aj\uu! " 
GFLEGLLPWQHWYMSRLLHQYAKAIFQDPE 

GLPDLRALLPSEDRN S 


466 


1816 


A 


3684 


3 


307 


SSQY1VQSKTKIFL* AAREKQ/RHTCRR1- SiKLb 
ANISSQTGEARGQWPSVFKVLKEKKLSTKKS 
FGQK*GR\RKTFPDKQK/LREFDTTRPTIQEML 

TGV1 ,QG 


467 


1817 


A 


3687 


2465 


837 


"ELPTPL1AAHQLYNYV ADHASS V HMKPLRMA 
RPGGPEHNEYALVSAWHSSGSYLDSEGLRHQ 
DDFDVSLLVCHCAAPFEEQGEAERHVLRLQF 
FWLTSQRELFPRLTADMRRFRKPPRLPPEPE 
APGSSAGSPGEASGLILAPGPAPLFPPLAAEVG 
MARARLAQLVRLAGGHCRRDTLWKRLFLLE 
PPGPDRLRLGGRLALAELEELLEAVHAKSIGD 
IDPQLDCFLSMTVSWYQSL1KVLLSRFPQSCR 
HFQ SPDLGTQ YL WLN QKFTDCF VLVFLDSH 
LGKTSLTWFREPFPVQPQDSESPPAQLVSTY 
HHLESVINTACFTLWTRLL*GSGLDH*MSLFL 
ES W AYQLACQRQD* P ALLGPRASQTLSDTKG 
FVTMS*GSAAPAWQQEPPSPNTHSH*PIQDSR 
ESGQPRGPLGPFWGTPFGPPGRVSGVHTGWQ 
TPPRAPLPESCPLNPLTTVSHLCPLSLRVFTSHL 
D1TAGHSHRDDTWVPIPALPLKHLRPPSSPFA 
LGPWVSHPLMRWVQKLSHLHSNPGTGFSMU 

GKQQPN - 


468 


1818 


A 


3691 


960 


499 


■0TCRKDKRAIYPHFQNE*MNKIKAI*SGTGG1 
QCFHSQNDSAFFFFLFLLETEFCSAAyTVQWH 
DFLSMQPPPPGFKQFTCLSLLSSWNYRRNPPPF 
PGNF\*FLVKTGFPHVGQTGFELLTSSDLAPLA 
SON / r r rT Tr3MSpr A WPFFFFFFFGLC 


469 


1819 


A 


3714 


4747 


495 


MA'YSWQTDPNPNESHEKQ VEHQEFLh VN^f 

HSSSQVSLGFDQIVDEISGKIPHYESEIDENTFF 

VPTAPKWDSTGHSLNEAHQISLNEFTSKSREL 

SWHQVSKAPA1GFSPSVLPKPQNTNKECSWG 

SPIGKHHGADDSRFSILAPSFTSLDKINLEKEL 

ENENHNYHIGFESS1PPTNSSFSSDFMPKEENK 

RSGHVNIVEPSLMLLKGSLQPGNfWESTWQK 

N1ESIGCSIQLVEWQSSOTS1ASFCNKVKKIR 

ERYHAADVNFNSGKIWSTTTAFPYQLFSKTK 

FN1HIF1DNSTQPLHFMPCANYLVKDLIAE1LH 

FCTNDQLLPKDHILSVWGSEEFLQNDHCLGS 

HKMFQKDKSVIQLHLQKSREAPGKLSRKHEE 

DHSQFYLNQLLEFMHIWKVSRQCLLTLIRKY 

DFHLKYLLKTQENVYNIIEEVKKICSVLGCVE 

TKQITDAVNELSLE.QRKGENFYQSSETSAKG 

LIEKVTTEl^TSIYQLINVYCNSFYADFQPVNV 

PRCTSYLNPGLPSHLSFTVYAAHNIPETWVHR 

INFPLEIKSLPRESMLTVKLFGIACATNNANLL 

AWTCLPLFPKEKSILGSMLFSMTLQSEPPVEM 

riTGVWDVSQPSPVTLXJIDFPATGWEYMKPD 

SEENRSNLEEPLKECIKH1ARLSQKQTPLLLSE 

EKKRYLWFYRFYCNNENCSLPLVLGSAPGW 

DERTVSEMHT1LRRWTFSQPLEALGLLTSSFP 

DQEIRKVAVQQLDNLLNDELLEYLPQLVQAV 

KFEWNLESPLVQLLLHRSLQSIQVAHRLYWL 

LKNAENEAYFKSWYQKLLAALQFCAGKALN 

DEFSKEQKLIKILGDIGERVKSASDHQRQEVL 

KKE1 G RLEEFFQD VNTCHLPLNP ALCIKGIDH 

DACSYFTSNALPLKJTFINANLMGKN1SI1FKA 
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1820 



471 



472 



473 



474 



1821 



1822 



1823 



1824 



3718 



3723 



3734 



3746 



3753 






430 



891 



443 



75 



494 



251 



500 



5262 




GDDLRQDMLVLQUQVMDN1WLQEGLUMQ 



MUYRCLSTGKDQRLVQMVPDAVTLAK1HRH 
SGLIGPLKENTIKK^SQHNHLKADYEKALR 
NFFYSCAGWCWITILGVCDRHNDNIMLTKS 
GHMFHIDFGKFLGHAQTFGGDCRDRAPFIFTS 
EM\EYFITEGG\KNPQKFQDFV\ELCCRAYNI1R 
KHSQLLL\NLL\EMMLYAG\LPELSGI\QDLKY 
VYKNLRPQDTDLEATSHFTKXIKESLECFPVK 
LNNLEHTLAQMSAISPAKSTSQTFPQESCLLST 
TRSIERAT1LGFSKKSSNLYL1QVTHSNNETSL 
TEKSFEQFSKLHSQLQKQFASLTLPEFPHWW 
HLPFTNSDHRRFRDLNHYMEQILNVSHEVTN 
SDC VLSFFL SE AGQQTVEES SP VYLGEKFPDK 
KPKVQLVISYEDVKLTILVTCFIMKNIHLPDGSA 
PS AHVEF YLLP YP SE VRRRKTKS VPKCTD PT Y 
NEIWYDEVTELQGHVLMLIVKSKTVFVGAI 
NIRLCSVPLDKEKWYPLGNSU*PLLLFSSFGM 

KSLE KDEFVGGMLLSNPIW 

SHGSISILNLHQGCVFLPSLPAQGLRCYRCLA 
VLEGASCSWSCPFLDGVCVSQKVSV/CWQ*/ 
CPWGARAEGRLSAWDSQISCCKGDLCNAV 
VLAAGSP WALCVQLLLSLGSVFLWALL 
"lrqsl/n s VPQ AG vq wrd s slq APPPRFTPLS 
CLSLPSSWDYRRLPPCLANFLYF* -RRGFTML 
ARMVLIS*PRDPPASASQ\STEITGGSHRAQHP 
TDSRDHSERSVKKSHEVISELRMKVIKCKVAF 

SKNPI ' 

GFIET*NFCVSKDTSKKLS/RLPTKWKNVFAN 

MSDKGLVSRICQELLRHLDAEQVSSTAGLSL 

THASGGARSGAGWAGRGVRAGTEAGRGG1F 

LTLSILRTRDLPSGAMSEGVDLIDIYADEEFNQ 

DPEFNNTDQIDLYDDVLTATSQPSDDRSSSTE 

PPPPVRQEPSPKPNNKTPAIL^TYSGLRNRRA 

AVYVGSFSWWTTDQQUQVIRSIGVYDVGEV 

KFAE NRAK 

RPLF AREGGI YAVLVCMQEYKTS V\L VQQAU 

LAALKMLAVASSSEIPTFVTGRDSIHSLFDAQ 

MTREIFASIDSATRPGSESLLLTVPAAVILMLN 

TEGCSSAARNGLLLLNLLLCNHHTLGDQIJTQ 

ELRDTLFRHSGL'VPRTEPMPTTRTILMMLLNR 

YSEPPGSP\ERAALETPIIQGQDGSPELLIRSLV 

GGPS AELLLDLERVLCREGSPGG A VRPLLKRL 

QQETQPFLLLLRTLDAPGPNKTLLLSVLRVIT 

RLLDFPEAMVLPWHEVLEPCLNCLSGPSSDSE 

IVQELTCFLHRLASMHKDYAWLCCLGAKEI 

LSKVLDKHSAQLLLGCELRDLVTECEKYAQL 

YSNLTSSILAGCIQMVLGQ1EDHRRTHQPINIP 

FFDVFLRHLCQGSSVEVKEDKCWEKVEVSSN 

PHRASKLTDHNPKTYWESNGSTGSHYiTLHM 

HRG VL VRQLTLL V A S ED S S YMP AR WVFGG 

DSTSCIGTELNTVNVMPSASRVILLENLNRFW 

PIIQIRIKRCQQGGIDTRVRGVEVLGPKPTFWP 

LFREQLCRRTCLFYTIRAQAWSRD1AEDHRRL 

LQLCPRLNRVLRHEQNFADRFLPDDEAAQAL 

GKTCWEALVSPLVQN1TSPDAEGVSALGWLL 

DQYLEQRETSRNPLSRAASFASRVRRLCHLL 

VHVEPPPGPSPEPSTRPFSKNSKGRDRSPAPSP 

VLPSSSLRNTTQCWLSWQEQVSRFLAAAWR 

APDFVPRYCKLYEHLQRAGSELFGPRAAFML 
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ALRSGFSG ALLQQSFLTAAHMSEQFARY 1DQ 

QIQGGLIGGAPGVEMLGQLQRHLEPIMVLSG 

LELATTFEHFYQHYMADRLLSFGSSWLEGAV 

LEQIGLCFPNRLPQLMLQSLSTSEELQRQFHLF 

QLQRLDKLFLEQEDEEEKRL* EEEEEEEEEEA 

EKELFIEDPSPAISILVLSPRCVSTVSPLCYLYHP 

RKCLPTEFCDALDRFSSFYSQSQNHPVLDMG 

PHRRLQWTWLGRAELQFGKQILHVSTVQMW 

LLLKFNQTEEVSVETLLKDSDLSPELLLQALV 

PLTSGNGPLTLHEGQDFPHGGVLRLHEPGPQ 

RSGEALWLIPPQAYLNVEKDEGRTLEQKRNL 

LSCLLVRILKAHGEKGLH1DQLVCLVLEAWQ 

KGPOTPGTLGHTVAGGVACTSTOVLSCILHLL 

GQGYVKRRDDRPQILMYAAPEPMGPCRGQA 

DVPFCGSQSETSKPSPEAVATLASLQLPAGRT 

MSPQEVEGLMKQTVRQVQETLNLEPDVAQH 

LLAHSHWGAEQLLQSYSEDPEPLLLAAGLCV 

HQAQAVPVRPDHCPVCVSPLGCDDDLPSLCC 

MHYCCKSCWNEYLTTRIEQNLVLNCTCPIAD 

CPAQPTGAFIRATVSSPEVISKYEKALLRGYVE 

SCSNLTWCTNPQGCDRILCRQGLGCGTTCSK 

CGWASCFNCSFPEAHYPASCGHMSQWVDDG 

GYYDGMSVEAQSKHLAKLISKRCPSCQAPIE 

KNEGCLHMTCAKOSIHGFCWRCLKSWKPNH 

KDYYNCSAMVSKAARQEKRFQDYNERCTFH 

HQAREFAVNLRNRVSAIHEVPPPRSFTFLNDA 

CQGLEQARKVLAYACVYSFYSQDAEYMDVV 

EQQTENLELHTNALQILLEETLLRCRDLASSL 

RLLRADCLSTGMELLRRIQERLLAILQHSAQD 

FRVGLQSPSVEAWEAKGPNMPGSQPQASSGP 

EAEEEEEDDEDDVPEWQQDEFDEELDNDSFS 

YDESENLDQETFFFGDEEEDEDEAYD 


475 


1825 


A 


3754 


1093 


"96 


GTSRNQHSPKTHA*RSS/WPQPPPLFLPPLU^ 

ATGRRRRRTRTQQRTAALLTDGTTKTGAAW 

SRRPSLC WPSRTTGAPGAK* AVL VRS ATPTTN 

PPNPQSPTGAAGKLRAPGNRAG/SEPSSQEPPP 

DGTRVRPASITGVAQSPATRATPSLPCLHVPAP 

SRGQTLGVRTTGRASRLTVDRSRLSWPGRSA 

RS GGGR WRPN APRGRWPRAP* SWEPGSWTE 

PWRWPrTAAESPPHRCIYCTNHVSPAGPARPS 

HVYIIRATINSISHPLCRAQSSPWEAAGVWRR 

PAQPAPTSDVNINLLRKPRVKRHDLIYQFLGN 

TLWEEGRQRPPETLQPAR 


476 


1826 


A 


3758 


901 


521 


FFFGNG VSPCPQ AGV* WHDLDSLQNLPFUr 1 R 
RFSYLSLPSSW\DYRHVPPRQANFCIF/M*RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFAPKGCLPRQKEGGTLNLI 


477 


1827 


A 


3761 


843 


575 


" GVISAHCNLRL/CHLPGSSNSPASASQVAGUU 
ARTTPS* IFVFLVETGFHHVSQDGLDLLWV1 
RPRRPLKVLGLQACTRARLPSPLKEL 


478 


1828 


A 


3763 


267 


1240 


HLLSFHLWSASLDCLEQLSQERHVKGMLLUP 
PPVNESTKPSPSPWKLTPPMCSIPPVFPPKSGS 
PTTSWS/PSGHSKLEVERAQTGPFCLHIYCP*P 
GVTDNTTSLLHYlPFPRLXSGLVCFrAH'rra i 
WTGHSFASQAWLRQVPEVSKHLQCPSAESLL 
TMEYHQPEDPAPGKAGTAEAV1PENHEVLAG 
PDEHPQDTDARDADGEAREREP/RRPSFAA*P 
1 VWGQPVESPLPEASSAPPGrTLGTLPEVETIRA 
CSMPQELP*SPRTRQPEPDFYCVKWIPWKGE 
QTPIITQSTNGPLPSPCHHEHPLSSVEGEAPPA 
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EGSDH1G 


479 


1829 


A 


3766 


2 


2152 


YSPLRLLEVCVPLPKiFlKRQAPLKVSLLQDLK 

DFFQKVSQVYVAJDERLASLKTDTFSKTREEK 

MEDIFAQKEMEEGEFKNWIEKMQARLMSSS 

VDTPQQLQSVFESL1AKKQSLCEVLQAWNNR 

LQDLFQQEKGRKRPSVPPSPGRLRQGEESKIS 

AMDASPRNISPGLQNGEKEDRFLTTLSSQSST 

S STHLQLPTPPE VMSEQ S V G GPPELDT ASS SE 

DVFDGHLLGSTDSQVKEKSTMKAIFANLLPG 

NSYNPIPFPFDPDKHYLMYEHERVPIAVCEKE 

PSSI1AFALSCKEYRNALEELSKATQWNSAEE 

GLPTNSTSDSRPKSSSPIRLPEMSGGQTNRTTE 

TEPQPTKKASGMLSFFRGTAGKSPDLSSQKRE 

TLRGADSAYYQVGQTGKEGTENQGVEPQDE 

VDGGDTQKKQLINPHVELQFSDANAKFYCRL 

YYAGEFHKMREVlLDSSEEDFmSLSHSSPWQ 

ARnn^sriAAFYATEDDRFILKOMPRUEVQSF 

L D F APH YFN Y I TN A V QQKRPT AL AK1LG VYRJ 

GYKNSQNNTEKKLDLLVMENLFYGRKMAQ 

vpni KHSl RNRNVKTDTGKESCDVVLLDENL 

LKMVRDNPLYIRSHSKAVLRTSIHSDSHFLSS 

HLIIDYSLLVGRDDTSNELWGIIDYIRTTTWD 

KKLEMVVKSTGILGGQG'MPTWSPELYRTR 

FCEAMDNYFLMVPDHCTGLGLNC 


480 


1830 


A 


3777 


251 


3 


" QGCGS AGTLIHY* *ECKMVQLLWKTV* QFLl 
KXNrvKDPAITLDVYPNEVKNYVRTKTYTQMF 

I/ANFIMAKSWKQPTHPSVRT 


481 


1831 


A 


3779 


333 


3 


FAAIROPEPNILDVNOIFKDLAMUHDQGDL1U 
SIEANAESSEVLVERAPGQLQRPAVYYQKICSR 
KJCMCLVVLVQTAIILICERIM*VVYTTKWSPPI 
VLPVSCFQGQKFN 


482 


1832 


A 


3780 


2 


371 


TGGRQGKNDHTSITEKPSRDFNRHL1TQN1 ¥ M 
PNQDMKSSSNSLIIRKVQIKPmYHHIFTRKA 
KMKTTDKTKYR* GFKAITTLIHC SQDClyLQ* S 
/L» ENHFM1FPKAEQHITYDTTLPFLR 


483 


1833 


A 


3787 


43 


448 


" LMKDL S P YVMETHYILN RLNER/RS M WRHI 1G 
FCLPNTKDQEKILKAIRGRREVIQGS/RQQYRR 
PAAFSAAEKARRLWCS/VFNIERRNL/CEYPTK 
LSFNIKGEMTFSDKTEFTrNRPSLKMLLKDRI 
OEEGKMF* KEKCFKRKE 


484 


1834 


A 


3798 


1 


TIT 

72/ 


" FFFFETESRSVAQAGVQWCNLGSLQALPPGFX 
SHSPASASRVAGTTGTRH*ARLIFYIFSRDGVS 
PC* PGWS* SPDLVIRPPVRLPKC WD YRREPPRP 
A*FFWLVE\QGlTTvlI^MlMVSIS*PQ/CDLPAS 
VSQNAGITGVSHCAWPCLHFCFFGFFFEMESC 
SVAQAEVQWHDLRSLQAPPPGFTPFSCLSLPG 
SWDYRRPPPRPANrACIFSRDGVSPC'PGWSRS 
PDLV1RPPRPPKVLGLQA 


485 


1835 


A 


3802 


1 


239 


" FFFFEMECLTVSQAGVQWYNLHSLQPLPPCih 
KQFSOLSLPSSWD*RVPTSRPAKF/CVIF*DGV 

SHCQPGWSAWQPPLH 


486 


1836 


A 


3811 


378 


98 


" RYD*SSQSENIP\QKEFLLKYP*CTATLGMR>« 
MSIMKKK SIFS AEF YK VSLPSLLLXHLLAlt w o 
FHIElOLTIHQHFLNYELESDFVHrVEYM 


487 


1837 


A 


3814 


771 


320 



- fdpdWTRAAGIRHEKKPICALAYRRENSPGUL 
PPPPLPPPEEEASWAL/GAEGSRQHVLPGAGA 
QWGEESGPGRAPGSPAGAPPR'RGLAPVNSRP 
SFLSRGQGTSTCSTAGSNSSRGSSSSRGSRGPG 
RSRSRSQSRSQSQRPGQKRREEPR 


488 


1838 


A


3818 


1 


781 


FRACLlJSLIPYAPaS^rACPPAMAGPROLl^ 

LCLIJ^CLAGFSFVRGQVU^GCDVKTTFV^ 

HVPCTSCAAIXKQTCPSGVvXRELPDQlTQDCR 

YEVQLGGSMVSMSGCRRKCRKQVVQKACCP 

GYWGSRCHECPGGAETPCNGHGTCLDGMDR 

NGTCVCQENFRGSACQECQDPNRFGPDCXJSV 

CSCVHGVCNHGPRGDGSCLCFAGYTGPHCD 

QELPVWQELGFPQNNPRLRKAPNCKCLPG*H 

RNGLLATPNPCRP 


489 


1839 


A 


3822 


934


669


FFFSEMESRSVTRLECSGAlSAHLRiXObSW^P 
ASAS* VAGTIGACHHAQL1FVFLVETGFHHVG 
nnGLDLLTNLMIHPPRPPKVLGFQA 


490 


1840 


A 


3825 


79 


9748 


GCQSCWPAWPRLRRRGPASAOAiU-LxKls-AP W 
GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 
ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQ^^oo 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAVAEEPLHRPKKELSATKKDRVNHCLTIC 

EN1VAQSVRNSPEFQKLLG1AMELFLLCSDDA 

ESD VRMVADECLNK VIKALMD SNLPRLQLEL 

YKEIKKNG APRSLRAAL WRF AELAHL VRPQK 

CRPYLVKLLPCLTRTSKJIPEESVQETLAAAW 

KIMASFGNFANDNElKVLLKAFlAr^KSSSFn 

RRTAAGSAVS1CQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLL1LGVLLTLRYLVPLLQQQV 

KDTS LKG SFG VTRKEMEV SPS AEQL VQ VYEL 

TLHHTQHQDHNVVTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AGGGSSCSPVLSRKQKGKVLLGEEEA1XDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQTTTEGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTODDS 

APLVHCVRLLSASFLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 

TTEYPEEQYV SDILN YDDHGDPQ VRGATAILC 

GTL1CSILSRSRFHVGDWMGTIRTLTGNTFSL 

ADCIPLLRKTLKDES S VTCKL ACT A VRNC VM 

SLCSSSYSELGLQLI1DVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNVVIHLLGDEDPRVRHVAAASL 

naVPKLFYKCDQGQADPWAVARDQ^VYL 

KLLMHETQPPSHFSVSTITWYRGYNLLPSITD 

VTMENr^SRVlAAVSHELrrSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMILTLLSSAWFPLDLSAHQDAL 

ILAGNLLAASAPKSLRSSVVASEEEANPAATK 

OEEVWP ALGDRALVPMVEQLFSHLLK VINIC 

AHVLDDVAPGPA1KAALPSLTNPPSLSPIRRK 

GICEKEPGEQASVPLSPKKGSEASAASRQSDTS 

fiPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

ORLGSSSVRPGLYrrr-CFMAPYTHFTQALADA 

SLRNMV'QAEQENDTSGWFDVLQKVSTQLKT 

NLTSVTKNRADKNAIHNHIR1JEPLVIKALK 














YTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 

DSDQVFIGFVLKQFEYIEVGQFRESEA1IPNIFF 

FLVLLSYERYHSKQnGIPKIIQLCDGlMASGR 

KAVTHAIPALQPIVHDLFVLRGTNKADAGKE 

LETQKEVWSMLLRLIQYHQVLEMFILVLQQ 

CHKEN^DKWKiaSRQlADiaPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILAILRVLISQSTED 

IVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 

EHSEGKQIKNLPEETFSRFLLQLVG1LLEDIVT 

KQLKVEMSEQQHTFYCQELGTLLMCLIHIFKS 

GMFRRTTAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMirTHPALVLLWCQILLLVNHTDYRWW 

AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWLIVNH1QDLISLSHEPPVQDFISAVHRNS 

AASGLFIQA1QSRCENLSTPTMLKKTLQCLEGI 

HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 

ACRRVEMLLAANLQSSMAQLPMEELNR1QEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVIC 

SQCWTRSDSALLEGAELVNRIPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LWVSKLPSHLHLPPEKEKDrVKFWATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WSWSSTEFVTHACSLIYCVHFILEAVAVQPG 

EQLLSPERRTNTPKAISEEEEEVDPNTQNPKYI 

TAACEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNIHSLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFrYR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGrVTEQEIQAMVSKRENIATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQVSIHSVWLGNSITPLREEEWpEEEEEE 

ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 

LELYSRWTLPSSSARRTPAILISEWRSLLVVS 

DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RS SHLPSRVG ALHGVL YVLECDLLDDTAKQL 

IPV1SDYLLSNLKG1AHCVNIHSQQHVLVMCA 

T AFYLIENYPLD VGPEFS ASIIQMCG VMLS GS 

EESTPSnYHCALRGLERLLLSEQLSRLDAESL 

VKLSVDRVNVHSPHRAMAALGLMLTCrvfYT 

GKEKVSPGRTSDPNPAAPDSES VI V AMERV S 

VLFDRIRKGFPCEARWARILPQFLDDFFPPQ 

DIMNKV1GEFLSNQQPYPQFMATWYKVFQT 

LHSTGQSSMVRDWVMLSLSNFTQRAPVAMA 

TWSLSCrTVSASTSPWVAAILPHVISRMGKLE 

QVDVNLFCLVATDFYRHQIEEELDRRAFQSV 

t FWAAPGSPYHRLLTCLRNVHKVTTC 


491 


1841 



A 


3826 


469 


302 


" SNPPASASRVAGlTGVHQHAWUFVFLVliMbh 
HHVGOAVLKLLISGDLPVSASQSA 


492 


1842 


A 


3836 


392 


88 


VAPSPMIMPDLYFYRDPEElEKEE*AAAbK\fcb 
FQSEWTAW/P/EFTATQSEVADWFKDMQVP 
SVPIQQFPTEDWST*PTMNDWSATSTAQTTE 
WVRITTEWP 




1843 


A 


3838 


19 


380 


TPSDMNRAFETDTQSlGEKNRSPSEPUYhbKK 
KFKRS*EKAHIR YKIDQPfciJirL^v£,r lAJvnorv 
CT ATL SMJUs MSLMKKKC SF SEEIALAFFPSLL 
wrH] f A rvi /5FYTFIHLTTFNNTF 


494 


1844 


A 


3845 


2 


352 


FFFLRRSL/DSVAQAEAQWLVELGLLQAPfPUt- 
KPISLP\GLPSSWDYGRPPPCPANFCIF/M*RRG 
FTVLARMVLIS*PCDPPTLASQGTATTGMSYH 
Aupomnn YAHOGRCWFRIX 


495 


1845 


A 


3847 


1774 


40 


DIFFRRAKEGMGQDEAQFSVKMPLTGKAYL 

W ADK YRPRKPRFFNRVHTGFE WNKYNQTHY 

DFDNPPPKiVQGYKFNTFYPDLIDKRSTTEYEL 

EACADNKDFA1LRFHAGPPYEDIAFKIVNREW 

E Y SHRHGFRCQF ANGIFQLWFHFKRYRYRR* 

RPWGTAGRCPRGHSKGASVKLWTPGPLSGL 

QGRGFTSHLRPHLSFARPQFPPI*KGGHH*AC 

HGELRRHWDRLA* GPDATEGALG ASFEHEG 

GOQPPADLTVQADTLHRPSARLGGAHRACPK 

RRPHRVLWRWARGAWAWRCQAREKQETQG 

QPCHITGHPLGREAEPAAAG AAP ALAHRPPF 

ARTGSTEXPGPCWRPIRHCRRDPLWTPTLORD 

WPPTHPVLAGGVHFPAAG/IGGCVEVPVSVN 

VMGTKSH*AVLPPPPSTGPGGQGLPEGWGLE 

KGEGLPPGIPPPGLLTGPVASMRPVTPSFAH1R 

TVAPSHSPFSGQEGRGPHGCHSPGR\SGP\AGR 

LVLQHPTGTSPTEAKRKVPPGPPEGHPTSPVT 

SPRPPTAPPRHPASSGNSSVCFSKKTCRWEKK 

SFVLMELAYWQDRMFF 


496 


1846 


A 


3849 


830 


442 


AKSPLPLG* IQWR/NLGSLKLRLPGFK* Kl LXU 
LLSSWDYRSLPPRPVNFCILVELGFHHVDQAG 
LKLLTSSALPALASQSAEITGMSHR1WPLPLLR 
RPPVIRIRAPPQRLPFNLITSLKALSPNMATF 


497 


1847 


A 


3859 


2 


393 


ALRKTRRDG1ARTG AQP AAS WKGTN N^WK 
LEMAGRPGSQEQSKDRGTGSLPPPSQRPLGPS 
PEGAGPSPPPPGIPRGGGS bbUr/ rViA>r vri%. 
RFPAPKKGLPSDTPHSKAPPTPHLILGGEDSQ 

VPTT. 


498 


1848 


A 


3860 


253 


634 


KN ASTV YS SQGDPK.SFFFLLRW SLAL V A^AU 
EO*RDLSSLQPPPPGFK*FSCLSLPSSWD\YRCP 
LPCLANFN*FLVETGFHHVGQADLKLLTSGDP 
PTS ASES AGITGVSHRAWPRIHt L Y WMrrL 


499 


1849 


A 


3863 


423 


263 


- - APSQIS V AFLY AA/DKLFEKEI* KKIPFUAS/UKi 
KIGINLTKEVKYLYTENYITLMKEIIODTDKW 
KX)ILY*WIGKIN1 + KMSTPPKAIYRFNAIPTKIP 
MTFFTEIEK^IIKFI WN11KKPPNTQSNIEQKE* S 
FCSILLWWGGFLWFHMNFMI0FSISVKNVIGI 
T vr,TAT>JT 


500 


1850 


A 


3865 


2 


15246 


" LPRGCL WCLQRSPTP ARPQPSRP ARSPLFLr r 
DLRPWASDLD1MGDAEGEDEVQFLRTDDEV 
VLQCSATVLKEQLKLCLAAEGFGNRLCFLEP 
TSNAQNVPPDLAICCFVLEQSLSVRALQEML 
ANTVEAGVESSQGGGHRTLLYGHAILLRHAH 
SRMYLSCLTTSRSMTDKLAFDVGLQEDATGE 
ACWWTMHPASKQRSEGEKVRVGDDITLVSVS 
SERYLHLSTASGELQVDASFMQTLWNMNPIC 
SRCEEGFVTGGHVLRLFHGHMDECLTISPADS 
DDQRRLVYYEGGAVCTHARSLWRLEPLRIS 
WSGSHLRWGQPLRVRHVTTGQYLALTEDQG 
LVWDASKAHTKATSFCFRISKEKLDVAPKR 
DVEGMGPPEIKYGESLCFVOHVASGLWLTYA 














APDPKALRLGVLKKKAMLHQEGRMDDALSL 

TRCQQEESQAARM1HSTNGLYNQFIKSLDSFS 

GKPRGSGPPAGTALPlEGVILSLQDLirYFEPPS 

EDLQHEEKQSKXRSLRNRQSLFQEEGMLSMV 

LNCEDRLNVYTTAAHFAEFAGEEAAESWKEI 

VNLLYELLASLIRGNRSNCALFSTNLDWLVS 

KLDRLEASSG1LEVLYCVLIESPEVLNUQENHI 

KSIISLLDKHGRNHKVLDVLCSLCVCNGVAV 

RSNQDL1TENLLPGRELLLQTNLINYVTSIRPN 

ffVGRAEGTTQYSKWYFEVMVDEVTPFLTAQ 

ATHLRVGWALTEGYTPYPGAGEGWGGNGV 

GDDLYSYGFDGLHLWTGHVARPVTSPGQHL 

LAPEDVISCCLDLSVPS1SFRINGCPVQGVFESF 

NLDGLFFPWSFSAGVKVRFLLGGRHGEFKF 

LPPPGYAPCHEAVLPRERLHLEPIKEYRREGP 

RGPHLVGPSRCLSHTDFVPCPVDTVQIVLPPH 

LER1REKLAENIHELWALTRIEQGWTYGPVRD 

DNKRLHPCLVDFHSLPEPERNYNLQMSGETL 

KTLLALGCHVGMADEKAEDNLKKTKLPKTY 

MMSNGYKPAPLDLSHVRLTTAQTTLVDRLAE 

NGHNVWARDRVGQGWSYSAVQDIPARRNPR 

LVPYRLLDEATKRSNRDSLCQAVRTLLGYGY 

NIEPPDQEPSQVENQSRCDRVRIFRAEKSYTV 

QSGRWYFEFEAVTTGEMRVGWARPELRPDV 

ELGADELAYVFNGHRGQRWHLGSEPFGRPW 

QPGDVVGCMIDLTENTIIFTLNGEVLMSDSGS 

ETAFREIEIGDGFLPVCSLGPGQVGHLNLGQD 

VSSLRFFAICGLQEGFEPFAINMQRPVTTWFS 

KGLPQFEPVPLEHPHYEVSRVDGTVDTPPCLR 

LTHRTWGSQNSLVEMLFLRLSLPVQFHQHFR 

CTAGATPLAPPGLQPPAEDEARAAEPDPDYE 

NLRRSAGGWSEAENGKEGTAKEGAPGGTPQ 

AGGEAQPARAENEKX)ATTEKNKKRGFLFKA 

KKVAMMTQPPATPTLPRLPHDWPADNRDD 

PEIILNTTTYYYSVRVFAGQEPSCVWAGWVT 

PDYHQHDMSFDLSKVRWTVTMGDEQGNV 

HSSLKCSNCYMVWGGDFVSPGQQGRISHTDL 

VTGCLVDLATGLMTFTANGKESNTFFQVEPN 

TKXFPAWVLPTHQNVIQFELGKQKNIMPLSA 

AMFQSERKNPAPQCPPRLEMQMLMPVSWSR 

MPNHFL Q VETRRAG ERL G W A VQCQEPLTMM 

ALHIPEENRCMDILELSERLDLQRFHSHTLRL 

YRAVCALGNNRVAHALCSHVDQAQLLHALE 

DAHLPGPLRAGYYDLLISIHLESACRSRRSML 

SEYIVPLTPETRAITLFPPGRSTENGHPRHGLP 

GVGVTTSLRPPHHFSPPCFVAALPAAGAAEAP 

ARLSPAIPLEALRDKALRMLGEAVRDGGQHA 

RDPVGASVEFQFVPVLKLVSTLLVMGIFGDE 

DVKQILKMIEPEVFTEEEEEEDEEEEGEEEDEE 

EKEEDEEETAQEKEDEEKEEEEAAEGEKEEG 

LEEGLLQMKLPESVKLQMCHLLEYFCDQELQ 

HRVESLAAFAERYVDKLQANQRSRYGLLDCA 

FSMTAAETARRTREFRSPPQEQINMLLQFKDG 

TDFFnPPLPEEIRODLLDFHQDLLAHCGIQLD 

GEEEEPEEETTLGSRLMSLLEKVRLVKKKEEK 

PEEERSAEESKPRSLQELVSHMWRWAQEDF 

VQSPELATRAMFSLLHRQYDGLGELLRALPRA 

YTISPSSVEDTMSLLECLGQIRSLLIVQMGPQE 

ENLM1QS1GN7MNNKVFYQHPNLMRALGM1IE 

TVN1EVMVNVLGGGESK£IRFPKMVTSCCRFL 














CYFCRISRQNQRSMFDHLSY LLENSGIGLGM 

OGSTPLDVAAASVIDNNELALALQEQDLEKV 

VSYLAGCGLQSCPMLVAKGYPDIGWKPCGG 

ERYLDFLRFAVFVNGESVEENANWVRLLIR 

KPECFGPALRGEGGSGLLAA1EEAIRISEDPAR 

IX3PGIRRDRRREHFGEEPPEENRVHLGHAIMS 

FYAALIDLLGRCAPEMHLIQAGKGEALRIRA1 

LRSLVPLEDLVGnSLPLQIPTLGKDGALVQPK 

MSASFVPDHKASMVLFLDRVYGIENQDFLLH 

VLDVGFLPDMRAAASLDTATFSTTEMALAV 

NRYLCLAVLPLITKCAPLFAGTEHRAIMVDS 

MLHTVYRLSRGRSLTKAQRDV1EDCLMSLCR 

YIRPSMLQHLLRRLVFDVPILNEFAKMPLKLL 

TNHYERCWKYYCLPTGWANFGVTSEEELHL 

TRKLFWGIFDSLAHKKYDPELYRMAMPCLC 

AIAGALPPDYVDASYSSKAEKKATVDAEGNF 

DPRPVETLNVIIPEKLDSFINKFAEYTHEKWAF 

DKIQNNWSYGENIDEELKTHPMLRPYKTFSE 

KDK£rYRWPIKESLKAMIAWEWTIEKAREGE 

EEKTEKKKTAKISQSAQTYDPREGYNPQPPDL 

SAVTLSRELQAMAEQLAENYHNTWGRKKKQ 

ELEAKGGGTHPLLVPYDTLTAKEKARDREKA 

QELLKJFLQMNGYAVTRGLKDMELDSSSIEKR 

F AFGFLQQLLRWMD1SQEFIAHLEAVVSSGRV 

EKSPHEQEIKFFAKlLLPLrNQYFTNHCLYPLS 

TPAKVLGSGGHASNKEKEMITSLFCKLAALV 

RHRVSLFGTDAPAVVNCLHILARSLDARTVM 

KSGPEIVKAGLRSFFESASED1EKMVENLRLG 

KVSQARTQVKGVGQNLTYTTVALLPVLTTLF 

QH1AQHQFGDDVTLDDVQVSCYRTLCS1YSLG 

TTKNTYVEKLRPALGECLARLAAAMPVAFLE 

PQLNEYNACSVYTTKSPRERAILGLPNSVEEM 

CPDIPVLERLMAD1GGLAESGARYTEMPHV1E 

ITLPMLCSYLPRWWERGPEAPPSALPAGAPPP 

CTAVTSDHLNSLLGNILR1IVNNLGIDEASWM 

KRLAVFAQPIVSRARPELLQSHFIPTIGRLRKR 

AGKWSEEEQLALEAKAEAQEGELLVRDEFS 

VLCRDLYALYPLLIRYVDNNRAQWLTEPNPS 

AEELFRMVGEIFIYWSKSHNFKREEQNFWQ 

NErNNMSFLTADNKSKMAKAGDIQSGGSDQE 

RTKKJOUIGDRYSVQTSLIVATLKKMLPIGLN 

MCAPTDQDLrTLAKTRYALKDTDEEVREFLH 

NNLHLQGKVEGSPSLRWQMALYRGVPGREE 

D ADDPEKI VRRVQE V S AVL Y YLDQTEHP YKS 

KKAVWHKLLSKQRRRAVVACFRMTPLYNLP 

THRACNMFLESYKAAWTLTEDHSFEDRMIDD 

LSKAGEQEEEEEEVEEKKPDPLHQLVLHFSRT 

ALTEKSKXDEDYLYMAYADIMAKSCHLEEG 

GENGE AEEE VE V S FEEKQMEKQRLL YQQ ARL 

HTRG AAEMVLQMI S ACKGETG AMVSSTLKL 

GIS1LNGGNAEVQQKMLDYLKDKKEVGFFQS 

IQALMQTCSVLDLNAFERQNKAEGLGMVNE 

DGTVTNRQNGEKVMADDEFTQDLFRFLQLLC 

irr-uTviMnFONYl RTOTGNTTTIhQIICTVDYLL 

RLOESISDFYWYYSGKDVIEEQGKRNFSKAM 

SVAKQVFNSLTEYIQGPCTGNQQSLAHSRLW 

DAWGFLHVFAHMMMKLAQDSSQIELLKEL 

LDLQKDMVVMLLSLLEGNVVNGMIARQMV 

DML VES S SN VEMILKF FDMFLKLKDIVGSE AF 

QDYVTDPRGLISKKDFQKAMDSQKQFSGPE1 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














QFLLSCSEADENEMINCEEFANRFQEPAKDIO 

FNV A VLLTNL SEHVPHDPRLHNFLELAES ILE 

YFRPYLGRIEIMGASRRIERIYFEISETNRAQW 

EMPQVKESKRQFIFDWNEGGEAEKMELFVS 

FCEDTIFEMQ1AAQISEPEGEPETDEDEGAGA 

AEAGAEGAEEGAAGLEGTAATAAAGATARV 

VAAAGRALRGLSYRSLRRRVRRLRRLTAREA 

ATAV AALL W AAVTRAGAAG AG AAAG ALGL 

LWGSLFGGGLVEGAKKVTVTELLAGMPDPT 

SDEVHGEQPAGPGGDAJPGEGASEGAGDAAE 

GAGDEEEAVHEAGPGGADGAVAVTDGGPFR 

PEGAGGLGDMGDTTPAEPPTPEGSP1LKRKLG 

VDGVEEELPPEPEPEPEPELEPEKADAENGEK 

EEVPEPTPEPPKKQAPPSPPPKKEEAGGEFWG 

ELEVQRVKFLNYLSRNFYTLRFLALFLAFAIN 

FILLFYKVSDSPPGEDDMEGSAAGDVSGAGS 

GGSSGWGLGAGEEAEGDEDENMVYY FLEES 

TGYMEPALRCLSLLHTLVAFLCI1GYNCLKVP 

LVIFKREKELARKLEFDGLYTTEQPEDDDVKG 

Q WDRL VLNTPSFPSNY WDKFVKRK VLD1CH G 

DIYGRERIAELLGMDLATLEITAHNERKPNPP 

PGLLTWLMSIDVKYQIWKFGVIFTDNSFLYLG 

VTfMVNlSLLGHYNNFFFAAHLLDIAMGVKTL 

RTILSSVTHNGKQLVMTVGLLAWVYLYTW 

AFNFFRKFYNKSEDEDEPDMKCDDMMTCYL 

FHMYVGVRAGGGIGDEIEDPAGDEYELYRVV 

FDITFFFFVIVILLAIlQGLnDAFGELRDQQEQV 

KEDMETKCFICGIGSDYFDTTPHGFETHTLEE 

HNLAKYMFFLMYLINKDETEHTGQESYVWK 

MYOERCWDFFPAGDCFRKQYEDQLS 


501 


1851 


A 


3869 


467 


665 


VIVAIYCQLIFDKGAKTIQ*PFQQIAL/CKRMK 
LGPCFTPCGKINSEWIRELSVRVKTIKHLEIGV 

N 


502 


1852 


A 


3888 


1042 


724 


" SGMQ WRDLTPLQPLPPRFKQFSCLSLPGSWD 
YRHAP\PLLTNF\*FLVEMGFCYVGQAGRKLL 
ASSDQSALASQSAGITGISTAPGPPFFFLNFEA 
GSCSVAQAGVQ 


503 


1853 


A 


3891 


1773 


1193 


EVDSQSGVQ'QAPGSLQLQTPGLK/VSCLLSK 

QD YRSSLPHL ASCC YYYYYY/VFL* RRGLTTL 
VQGGLKLLPSSNPFASAP*TAGITGMSHCAGP 
HFNF*MFRKJSCIRE*F*HTRJYT)IPFL1LFFKET 
WVLLCYPGWPQIPGLKPSSCLRLLSSWDHRC 
APPCPASFFIFHVDRVSPPCPGLVSITFKMLLL 

L 


504 


1854 


B 


3896 


279 


70 


" MV SKSKSILMS YNHVELTFSDMKKMPEAFRK 
TQKHTIYLIPYQVIFWSTGKDAMRSFMMPFY 
QKEYYENQ* 


505 


1855 


A 


3899 


2 


1396 



" 'EPGVPTKKTWFDKPDFNRTNSPGFQKKVQhCj 
NENTKLELRKVPPELNNI SKXNEHFSRFGTL V 
NLQVAYNGDPEGALIQFATYEEAKKAISSTEA 
VLNNRFIKVYWHREGSTQQLQTTSPKVMQPL 
VQQPILPWKQSVKERLGPVPSSTIEPAEAQS 

ASSDLPQVLS 1 \JLLA ¥ i s «jkv^^v^ l ' " rs ^ vn> V IV 1 

LLVSTSAVDNNEAQKKKQEALKXQQDVRKR 

KQEILEKHIETQKMLISKLEKNKTMKSEDKAE 

TMKTLEVLTKNTTKLKDEVKAASPGRCLPKSI 

KTKTQMQKELLDTELDLYKKMQAGEEVTEL 

RRKYTELQLEAAKRGILSSGRGRGIHSRGRGA 

VHGRGRGRGRGRGVPGHAWDHRPRALEIS 



1856 






507 



1857 A 



508 1858 



509 



3936 439 



A 3944 



1859 




AFTESDREDLLPHFAQYGElEDCQLJDUSibLHA 

r^T,^^ * t- a t? a a a \ rur: a ufkrn DDI XI .AWN 



18 



120 



3949 



512 



1862 




513 



1863 



3957 



31 



412 



AFTESDREDLLPHFAg Y (jtAtu^^w* 
V1TFKTRAEAEAAAVHGARFKGQDLKLAWN 

v P VTN1 S A VE TEEVEPDEEEQREniA 

"DAELSGTLSLVLTQCCKXlKU'l VQKLASDHK 
D1HSSVSRVGKAIDKNFDSDISSVGIDGCWQA 
DSQRLLN E VMVEHFFRQGMLD V AEELCQE S 
GLSVDPSQKEPFVELNRILEALKVRVLRPALE 
WAVSNREMLIAQNSSLEFKLHRLYFISLLMG 
GTTNQREALQYAKNFQPFALNHQKDIQVLM 
GSLVYLRQG1ENSPYVHLLDANQWADICD1FT 
RDACALLGLSVESPLSVSFSAGCVALPALINIK 
AVIEORQCTGVWNQKDELPIEVXDLG* KS AGY 
HSIFACPILRQQTTDNNPPMKLVCGHIISRDAL 

MKMFNOS KLKCPYCPMEQSPGDAKQgF 

" SHPFSPAPGICPDAPPPLPRPSKGLLiHPGTAGA 
PGSGARCHPPSTCSPSWASPG'GAKASPALPR 
SHGVTLLCKAQAHLCRGEDSKDASGSTSQA 
WEPG* G A WGMPRCQGPALGSCFCPPGTTVQ 

RPAKQRDKRNRHLGR 

WCPAGTLD FPGPQEMVLLEIEVMNQLNHRNL 

. ^mrojxn CV/fTAVFrPK* W* GLGGGT 



392 



1086 



3961 



WCPAG 1 LUrrurV^ 1 ™ VLLL1£ ' v 1V1J,V < 1J1 ^ 

IQL YAAIETPHEIVLFMENYECPK* W GLGGGT 
TRHGASRPGVCAHSIEGGELFERIVDEDYHLT 

T^TfFSPREKGRGVLSVLL^*KLRVll^I<aP 

^ mTT rvnfA\u;TP.n«PMTI * TCF.ORG 



MVFFLQNFC/RIILN VA\WTGD *PNTL*KEQRG 
ITFSDSKS*YKATKIKTMWYCHKNRYID/ERN 

RTFJPE INPCICDKJIFRKLSMTTQ 

FSETRACCPRL EHSGRIEAHCSLN1PUSSUPPT 



3038 



476 



PEPEGEVGPPRV1 lbKfaKULrnr^u^ - 
LHPLLCLRHHPLPHLIPTGPHRLKRPRM\P\SP 
MAALILVADNAGGSHASKDANQVHSTTRRN 
SNSPPSPSSMNQRRLGPREVGGQGAGNTGGL 
EP VHPASLPD S SLATS APLCCTLCHERLEDTH 
FVOCPSVPSHKFCFPCSRQSIKQQGASGEVYC 
PSGEKCPLVGSNVPWAFMQGEIATILAGDVK 

VKJCERDS 

"0*DRARLDCSSATSAHCNLRLPGS*DSPAbA3K 
VAGTTDTHHHTWLILGSSVQTGFDHVGQAG 
LELLTSGDPPISASESAGIMGMSHCVWP*SWG 
LSHHMAPPQGDGGRARGTPGPEQSFWNLSC 
H* PRCO VPS* LN1TQ1VFWGRHQ YNPTMKRGK 
LRHREACSLPLPGEGEPGLQPSS\*SQNPCSSPL 
FHHGL*AWLWCPELLLQGQARRH*RSPPS/FK 
CPATLSLTAWSQTKRLRSQ^LLPWL*RAL*H 
PP\CH WPSRRSLGDPLLPRSQG * RDGT* ASTFC 
S YF* DTESHL V AQ AG VQ WRDLGSLQPPCPRL 
K\RFSRLSPPSSYTHRYVPSHLAESaSSRDRIP 

PSRPPRSR NSNSLSR r 

"VALTTSMCCNKQV1VTDKIKSAS1ADRCGALH 

VGDHXLSiDGTSMEYCn-AEATQFLAKTTDQ 
VKLEILPHHQTRLALKGPDHVKIQRSDRQLT 
WD S W ASNH S S LHTNHHYNTYHPDH CR VP AL 
TFPKAPPPNSPPALVSSSFSPTSMSAYSLSSLN 
MGTLPRSLYSTSPRGTMMRRRLKKKDFKSSL 




514 



515 



516 



1864 






A 3967 



1865 



1866 



3969 



A 3977 



517 



518 



519 



1867 



1868 



1869 



833 



492 



800 



182 



A 



3980 



3986 



3994 



1358 



974 



751 



1357 



1022 



666 



126 




SLASSTVGLAGQWHTETTEWLTADPVTCil*' 
GIQLQGSVFATETLSSPPLISYIEADSPAERCG 
VLQIGDRVMAINGIPTEDSTFEEASQLLRDSSI 
TSKVTLEIEFDVAESVIPSSGTFHVKLPKKHN 
VELGITISSPSSRKPGDPLV1SDIKKGSVAHRT 
GTLELGDKXLAIDNIRLDNCSMEDAVQILQQC 
EDL VKLKIRKDEDNSDEQES SG A1TYTVELKR 
YGGPLG\ITISGTEEP\FDL*nSSLTKGGLAERT 
GAIH1GDR1L\AINSSSLKGKPLSEAIHLLQMAG 
ETVTLKIKKQTDAQSASSPKKFPISSHLSDLGD 
VEEDS SP AQKPGKLSDMYPSHGCPS VDS AVD 
SWrXjSA\mTS\YGTEGT^SFQASGYWNTYD 
WRSPKQRGS\LSPVT\KPRSQTYPDVGLSYED 
WDRSTASGFAGAA\DSAETEQEENFWSQALE 
DLETCGQSGILRELEATIMSGSTMSLNHEAPT 
PRSPAGSDRPSFQERSSSRPHYSQTTRSNTLPS 
DVGRKSVTLRKMKQEIKEIMSPTPVELHKVT 
LYKDSDMEDFGFSVAIXiLLEKGVYVKNIRPA 
GPGDLGGLKPYDRLLQVNHVRTRDFDCCLV 
WLIAESGNKLDLV1SRNPLASQKSIDQQSLPG 
D*SEQNSAFFQQPSHGGNLETREPTNTL 



LEKQGVSGMATKRLARQLGLIRRKS1APANU 
NLGRSKSKQLFDYLIVIDFESTCWNDGKHHH 
SQEHEFPAVLLNTSTGQIDSEFQAYVQPQEHPI 
LSEFCMELTGIKQAQVDEGVPLKICLSQFCK 
W1HKIQQQKNHFATGISEPS/DF* SKIMCICYL 

VR»RI SYTY*SKHKSKGC . 

CRFWGISTHCDTCDPLSPQTTEG**EGDLWSL 
DLLGPEFLARKPLFKTKTYQSTF*SISKNE/FTC 
PNFIIEEGTDLI F\* QV KHNPCHRLTPEEGTVQL 

NRADS 

ICMLC/QKESNYIRLKRAKMDKSMFViOKllAil 
G AFGE VCL ARK VDTKAL Y ATKTLRKKD VLL 
RNQVAHVKAERDILAEADNEWVVRLYYSFQ 
DKDNLYFVMDYIPGGDMMSLLIRMGIFPESL 
ARFYIAELTCAVESVHKMGFIHRDIKPDMLID 
RDGHIKLTDFGLCTGFRWTHDSKYYQSGDHP 
RQDSMDFSNEWGDPSSCRCGDRLKPLERRAA 
RQHQRCLAHSLVGTPNYIAPEVLLRTGYTQL 
CDWWSVGVILFEMLVGQPPFLAQTPLETQM 
KVINWQTSLHIPPQAKLSPEASDLIIKLCRGPE 
DRLGKNGADEIKAHPIF + NQFDFSQ*PEDSRS 
AFKQFP*NHTTPTDTSNFDP\VDPDKLWSDDN 
EEENVNDTLNGWYKNGKHPEHAFYEFTFRRF 
FDDNGYPYNYPKPIEYEY1NSQGSEQQSDEDD 
QNTGSEIKKRDLVYV 



KjyS l \JOE>.UVM iu^jw _ 

FFFKJCFTQSLGFLLFSFSFLFSCFFFFHFVLHJY 
VFLDRVPLCHPGWSAVVQSQVTAO>JLPPSWD 
♦RCRPPH/LANLCNFCRD^SFTTLPRLVLNTWA 

OAIFO PQPPKVLGLQV . 

SPEMESHPITQAGVQWHHLSSLQPLPPUFK'h 
SCFSLPE*LGYRHVPPCLANSVFSVEMG\FLH 
VGQAGLELLTSGDLPALASQSAGITGVSHRAR 

PENGFENIF 

NQGLRHVGLCRTCLVNQMF'ASSILGKSHHHS 

LISINQGHNALWKAAG\PLPLKAGYOQSFSPC 

DSLKYG\SWDEKI)LTVPQRDTHKRSVLRWIS 

QRGKVLAVEMEEGHCLLVLPLGTECLGIK\PIV 

HLFSSEMGE^NRPMVGVMIHVYSNAALLSFTP 














LRCLGGEKHKSGLHARPVlVPSLELHYDMJJbl 
AHWFADLLLIITLPSYY1PFC 


520 


1870 


A 




882 


698 


QSFRLSLLSSWDYRHM*PRLANF*T\FFCKiJK/ 
SLALLPRLVSNSWPQAILPPRPPKVLGLQT 


521 


1871 


A 


4011 


1346 


1178 


FFP*ETVSCSAS*AGVRSHDNSSLQPPSP^5iN 
PPTSASHVAGATGTHHHAWLLSV 


522 


1872 


A 


4015 


i 


377 


QG1ALLTRMGES VKH VTGG Y RLRTRPLEFAA 
IGD YLDTF ALKLGTIDR1AQRI EKEEIEYL VELR 
EYGPVYSTWSALEGELAEPLEGVSACIGNCST 
AL* ELTDDMTEDFLF VLREYIL Y SD SMK 


523 


1873 


A 


4018 


n A i 

341 


19 
i y 


ERVIHNQ1QQ AQRSPHIFN ARRSS/PRFN 1 v 

KVKEVCKTSKS/GQVIYKGVSIRLRANFLAEP 

L*NRREWDEAIKVLKEKQ\FLSKMVYPANLSF 

GNEGDITSFPAK 


524 


1874 


A 


4020 


1067 


743 


FFLRWSUDSVAQAGVKWCNLGSLQAP^KUl- 
TPFSCLSLPSSWDYRHPPPRLAN*LTNFLCF** 
RQGFTVL ARMVLIS * PHDLP AS ASQS AG1TGL 
SHCSWPTSSILS 


525 


1875 


A 


4021 


781 


j j i 


QFRVlFFFLRRSHSVAQAGMQWHDHSLLgi-L 

PPRLKQ/F/SHLSPPS1WDYRRVPPCLVNFSIFF 

VETGSCQPCLQLLGSSNPPASASQSAGIAGISH 

QGQPE+SFDIRFACVIAALRETFQCLCSASRVN 

NK1IKRPTHPVESSF 


526 


1876 


A 


4024 


80 


341 


"TPSSTSRGTEEQQSSKMAWQRREEKEHLNYK 
RSSAEDGWKADKP/VDG*TPGEDHLPTPSPFQ 
LHIHSSESQLHHSVKSPPSLSFRLM 


527 


1877 


A 


4026 


593 


230 


DF YLYPERKKRGQMMTAVSLTTRPQbS v Al-fc 
DVAVYFTTKEWAIMG\PAERALYRDVMLEN 
YGGCGPL*CHPTSKPALVFS\LEQGKESCFSPA 
TGSSLSRNDWRAGWIGYLELRRYTYLS 


528 


1878 


A 


4028 


1160 


242 


" GTSELLCIQRWNWGPAFPPRPGLALAP 1 LQLL 
VEMGSAKSVPVTPARPPPHNKHLARVADPRS 
PSAGILRTPIQVESSPQPGLPAGEQLEGLKHAQ 
DSDPRSPTLGIARTPMKTSSGDPPSPLVKQLSE 
VFETEDSKSNLPPEPVLPPEAPLSSELDLPLGT 
QLSVEEQMPPWNQTEFPSKQVFSKEEARQPT 
ETPVASQSSDFCPSRDPETPRSS\GSMRNRWKP\ 
NS SK VLVGKSPLHPSCQDDNSPGTLTLRQGKA 
AFKPLSENVSELK\EGA\ILGTGR\LLKTEGRA 
WEOGODXHDKENQHFPLVES 


529 


1879 


A 


4039 


o 

/ 


366 


■ KDMVLIMEMQSMTTMKCPQ YL*E*RKIFD11 R 
CW*GCGSTGEJFC/WS*PL*KTI*QPR*FKQl*T 
ILTIIYSIM*EHTFHNAGV*LSDIYPRFMKGYV 

HTE1CT* MFIA VLF VWKTWKQF 


530 


1880 


A 


4<JD / 


J JO 


3 


" " LLE VNGNTIVTVFTKAQNKKNKG S RS ILh K.QL 
RKYGSRINLLKSKHDKNICTENYKT* MKEIEA 
/DTDKWKJDILCSWIRRJHMKDIECSWIGRTHV 
VKJSILPKVNYRFYLISIK1IMAI 


531 


1881 


A 


A(\A] 
*+\JO 1 


50 


278 


" TQGTEEIYKISSCEWVQASFSTPLITLHDFKIY 
HKATVTKM V WYWHRQ* KESKN/RIESSEIEPH 
IYDQFIFDKGEKJIQEKGNSFFNN/MCWKNW1F 


532 


1882 

1 

1 


A 


4069 


19 


368 

1 


— xfnTT FNFKF\VE*FKE* LENIN GTVTEKE 1 GOV 
YK£LSSPKYSGTRQFYGQTISNFPGKIlSM\ r Y 
KXFQNTEmSGRfiPISLYEFRrTLITIPNKDNIYL 
OIWMPVSLMNTVTLKCPT 


533 


1883


A 


4076 


1 


355 


P1RKFTKV AG»KSNTPK* LAFLHINNEQF fcN W/ 
ITNlyPFIlASKRIKYSGISLTKEMKDLYTErLLR 
KJKEDTNKWKDI/SCTWVGR/LNIVTCMPKAnC 

IFNAIPIKMPMMCMAKIEKNSS 1 


534 


1884 


A 


4088 


3 


1931 


IIDSSTRRMESERSPLYRQLIDLGYLSSSHWNC 

GAPGQDTKAQSMLVEQSEKLRHLSTFSHQVL 

QTRLVDAAKALNLVHCHCLDIFINQAFDMQR 

DLQrTPKRLEYTRKKENELYESLMNIANRKQE 

EMKI)MIVETLKrMK£ELLDDATNMEFKDVI 

VPENGEPVGTREIKCCIRQIQELnSRLNQAVA 

NKLISSVDYLRESFVGTLERCLQSLEKSQDVS 

VHITSNYLKQILNAAYHVEVTFHSGSSVTRM 

LWEQIKQ1IQRTTWVSPPAITLEWKRKVAQEAI 

ESLSASKLAKSICSQFRTRLNSSHEAFAASLRQ 

LEAGHSGRLEKTEDLWLRVRKDHAPRLARLS 

LESRSLQDVLLHRKPKLGQELGRGQYGWYL 

CDNWGGHFPCALKSVVPPDEKHWNDLALEF 

HYMRSLPKHERLVDLHGSVIDYNYGGGSSIA 

VLLIMERLHRDLYTGLKAGLTLETRLQIALDV 

VEGIRFLHSQGLVTIRDIKLKNVLLDKQNRAKI 

TDLGFCKPEAMMSGSIVGTPIHMAPELFTGK 

YDN S VD VY AFGILF WYI CSGS VKLPE AFERC A 

SKDHLWNNVRRGARPERLPVFDEECWQLME 

ACWDGDPLKRPLLGrVQPMLQGIMNRLCKS\ 

NSEQPNRGLDDST 


535 


1885 


A 




n 
L 


417 


ALMPHEANYEEIFLKTDKJDMDGFESGLEVKh 
IFLKTR/GLPSTLLAHIWALCDSKDCGKLSKD 
HFALAFHLmQKLIKGIDPPLVLTPEKISPSNR 
ASLQKVTELTRKPVCIIFKGTIL WRITDS IWMK 
HNRKRTWLRA 


536 


1886 


A 


a i rn 




829 


" DHQK'KNIPCSWIGRINIVKMSILPKAIYRFSAI 
PIKIPMTFFTEI*S*NVYRTTKTQE*AKAILSKK 
EQNLEESHYLDFK* YYRA V . 


537 


1887


A 

A 


4104 


54 


281 


SIDCEHLIRRMLVLDPSKRLT1AQIKEHKWML 
IEVPVQRPVLYPQEQENEPSIGEFNEQVLRLM 
HSLGIDQQKTIE 


538 


1888 


A 


4109 


141 


314 


IRHCPLKIRSWSHLKCFYKFILTFFFAGCSgFL 
VPRENITAWMNAIGLnTALPVS 


539 


1889 


A 


4111 


268 


1 


" ASRPWGHSYP*FNQQEVDTLKRPIASSEI*MM 
I*KFAT\KKSPGPYRFTAEFSHTFKEDLVPILW 
PLFPKJYREGTLPHSFYEASITL 


540 
541 


1890 
1891 


A 

A 


4142 
4146 


198 
282 


2064 
778 


PEPGAGRAATPWGPLFWRGRGSGRCEKAAE 

AALGDFLGLHRRTQQPAVDRLLSDASAQWR 

VRGHGGVRESGRAPQQPGRRRGRRPRKRPR 

GR WRREGCG AGGRGVCV AA WSQRSIAGNN 

DYRLFHKMSNSHPLRPrTAVGEIDHVHILSEH 

IGALLIGEEYGDVTFVVEKKRFPAHRVILAAR 

CQYFRALLYGGMRESQPEAEIPLQDTTAEAFT 

MLLKYIYTGRATLTDEKEEVLLDFLSLAHKY 

GFPELEDSrTSEYLCTILNIQNVCMTFDVASLY 

SLPKLTCMCCMFMDRNAQEVLSSEGFLSLSK 

TALLNIVLRDSFAAPEKDIFLALLNWCKHNSFC 

ENHAELMQAVRLPLMSLTELLNWRPSGLLSP 

DAILDAIKVRSESRDMDLNYRGMLIPEENIAT 

MKYGAQWKGELKSALLDGDTQNYDLDHG 

F SRFtPIDDDCRSGLElfvLuQr o llfi ri v killwuk 

DSRSYSYFIEVSMDELDWVRVIDHSQYLCRS 

WQKLYFPARVCRYIWVGTHNTVNKIFHTVAF 

ECMFTNKTFTLEKGLIVPMENVATIADCASVI 

EGVSRSRNALLNGDTKNYDWDSGYTCHQLG 

SGAIWQLAQPYMIGSIRVLLWDCDDRSY 

GTLGYFNGARGQPQDNFFAHQWSHHPPISAC 
HAESENFAF WQDMKWKNKFWGKSLE1 VP V <J
TVNVSLPRf GDHFEWNKVTSCIHNVLSGQRW 
IEHYGEVLIRNTQDSSCHCKITFCKAKYWSSN 
VHEVQGAVLSRSGRVLHRLFGKWHEGLYRG 
PTPGGQCIWKP 


542 


1892 


A 


4147 


44 


433 


S VD A Y VCNDIVFS YRTTJTLLEGA* LTHR Y VA 
QDPICQGQLRSLHLTCDSAPAGSQGTWSTSCR 
INHLIFRGGAQITFLATFDDSPKAVLGDRLLLT 
ANVSSENNTPRTSKTTFQLELSVKDAVYTVV 

SSH 


543 


1893 


A 


4153 


678 


11 


TISYPQCLTQMYFLISFANVDTFLLPIMALDH 

YVAICSALQ*CSIITP/ELCQGLPVLA*AGSSLIS 

PVHTVIMSRLAFCSSAQISHFYRDAYLLMKIA 

CSHT*\NQHVFLGAWLFLAPCAL1LVSYIRIA 

AAILRIPSPTRRRKACSICSSHLSLVTLFYGTV 

LGICI * PPDSF S AQDAIATIMYTVVTSMLNPF1Y 

SLMNICEVQEAVRJRLFSRGSHSSWCW 


544 


1894 


A 


4158 


3 


538 


LLYAQAGVQ*LNLSSLQPQPAGLKQSSHPSLP 
SSWDYRYSTPHPANFFVEMEFHHVAQAGLEL 
LGSGDLPTSTSHSAGITGV\SHHAPPRLISSEGS 
1 T GHI LCLPMVFPLLC VF VLI SS S L AGEEAAG 
LRVQKLWPAVVLSHLPVCWFHCSGIWSEVIE 
LKVGREGHVLPWQAHWEF 


545 


1895 


A 


4160 


1 


412 


HPLGLGLVPSEIFSPQDKKAADGS1LAPARGE 

DLEAGLKGSFMDGRLQASVSVFRIQRVGSAM 

QDTASAMPCLPYYPTSHCFMAGGKSRSQGW 

ELELSGEPAPGWQVLAGYTYTQARYLRDASE 

ANVGQPLRPVDPR 


546 


1896 


A 


4174 


1252 


1190 


FFQVFIFLFLIFFKTEFHSCCPGAVQWHDLDSL 
QPPPPRFKGFSCLSLPSSWDYRHAPAHPANFV 
FLVETGFLJiV\GQ\ASLELPTSGDTPAS\ASQSA 
GITG VS HHA * PRASGRRC W 


547 


1897 


A 


4176 


3029 


1 


AGPDGLAAPASCQGARGQTRVPGAFS WLAi* 

GSHHASEGLAPGVPPAGGVSAQELTAPPQEG 

WGLGAPPAAPRPESDEKRAGSDAVRSFSRGA 

RDSLGQRRLGGTRGAGPAGKGAQRTMGPAS 

GFHSFPPRPHQEPSPRSSCWQHLLWHCPWPQ 

PSRLPRLTP AQLLQGPGVLAAPPGP* HVPGFL 

AQSPWPLPSGPRSP*DPLHQGALVPLPQGGSP 

HTAPHCLPSVLSPAIQQPLLPTAST/SSRSPPAS 

TMAPIPSALAVWEPAGSSPQLSSAPADSS\PLP 

ALPKVLPPWTQKPLLGCLCQSPLPLLSPPDQI/ 

RCPPACSPAAASSFSFESQPCPSAPSKASPAPA 

ALUVGPHHPP* SQQPQSQSVHPHGPGGPQPPL 

AAS SLFWMFCQPPPPHPQFL WHRPLPVTGKA 

LAS\PLCFRPAPGSLRQTPLPPQFHIPRPGLSAP/ 

PPPASGTSDSSDSRSPSASAARVWPPA\SPPPP 

AARHRPHPPEYFLSPCPFSCGFPRLLGRPRRPQ 

ALQTPRAWDLPPGSSPAPLCSGPELP'APPPLP 

PFPRVA*LGSGHPPSAQVPGLW*RCV*GHPIP 

RPVGHS*SGPPHSPPL*APPQAWPLELPPSRQC 

LQPLHLRAAQPLDPCCSLSPPGPPLPVPALPS 

WPGRP* SPSPASSQPPYHAGLPGPQSSPLPPGL 

PQLPSLRSGSQQPLLFFQCPGPG A V WGKGSPQ 

PLSPHPPPP/ARTQTFPVASRSLSPGTAPYSVCL 

TPSRSASSLPEWLASSLPKIPQSSGS^LGPTSP 

MP*CFHRPSPPLP/LSSPFPA\LRPQAPQFPLHLP 

P* PP APSPGCPLPPLAQQHQPSPPSPHARSTLT 

PPLWPSLALLP*PLPPPPPVPSFSASLLCSLPAH 
GTP ASPGLGRSCLGKPQ' l'LP Wl SFWPF bUKi-A
PGTWQPW/PVSPAPLSCLSAWDPWELPSPQPQ 
VCSTAELPTSCLLSSPGP\PAFQPPRFGCL*GPP 
GPPGLPPLQSSLSFPPPPPPVPQPPAPPALQWG 

LHLPGGRTK 


548 


1898 


A 


4180 


2369 


844 


RTHREEDFQFILKGIARLLSNPLLQTYLPNIi 1 K 

KJOFHQELLVLFWKLCDFNKVGQPRGALQGD 

GEOLPQ*PGGRDSVRLRGVGQSCPSLELSPLG 

PSPHP*KFLFFVLKSSDVLD1LVPILFFLNDAR 

ADQSRVGLMHIGVFILLLLSGECNFGVRLNKP 

YSIRVPMDIPVFTGTHADLLIVWFHKIITSGHQ 

RLOPLFDCLLTIVVNVSPYLKSLSMVTANKLL 

HLLEAFSTTWFLFSAAQNHHLVFFLLEVFNN1 

10 YOFDGN SNL VY Al IRKRS IFHQL ANLPTDPP 

TIHKALQRRRRTPEPLSRTGSQGGAPPWRAPA 

PLPLQSQAPSRPVWWLLQALTS*PRSPRCQR 

MAPCGPWNLSPSRAWRMAARLRGSPARHGG 

SSGDRP/HSSASGQWSPTPEWVLSWKSKLPLQ 

TIMRLLQVLVPQVEKiCIDKGLTDESEILRFLQ 

HGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGV1YLRNVDPPVWYDTDVKLFEIQRV 


549 


1899 


A 


4191 


858 \ 


321 


LPWQRLGVLLSRGKMAVTG WLESLRTAgK l 

ALLQDGRRKVHYLFPDGKEMAEEYDEKTSE 

LLVRKWRVKSALGAMGQWQLEVGDPAPLG 

AGNLGPELIKESNANPIFMRKDTKMSFQWRIR 

NLPYPKI)VYSVSVDQK£RCIIVRTT>4KKYYK 

KFSTPDLDRHQLPLDDALLSFAYTPTAP 


550 


1900 


A 


4192 


1 


1980 

i 


" IRHTGSDIAGVCG V/LLLSGPCGVGLDLDSRLL 
G AS AMRRSEVLAEES I VCLQKALNHLREIWE 
L1GIPEDQRLQRTEVVKKHIKELLDMM1AEEE 
SLKERLIKSISVCQKELNTLCSELHVEPFQEEG 
ETTILQLEKDLRTQVELMRXQKKERKQE\LKL 
LOEQDQELOEILCMPHYDIDSASVPSLEELNQ 
FRQHVTTLRETKASRREEF/V S SIKRQIILCtvIE 
ELDHTPDTSFERDWCEDEDAFCLSLEN1ATVL 
OKLLRQ\LEMQKSQNEAVCEG\LRTQI\R£LW 
DRLOIPEEEREAVATIMSGSKAKVRK\ALQ\LE 
VDRLEELEKCKTMKKVIEATKVELVQYWDQC 
FYSQEQRQAFAPFCAEDYTESLLQLHDAEIVR 
LKNYYEVHKELFEGVQKWEETWRLFLEFER 
KASDPNRFTNRGGNLLKEEKQRAiCLQKJvaP 
KLEEELKARIELWEQEHSKAFMVNGQKFME 
YVAEQWEMHRLEKERAKQERQLKNKKQTET 
EMLYGSAPRTPSKRRGLAPNTPGKARKLNTT 
TMSNATANSSIRPIFGGTVYHSPVSRLPPSGSK 
PVAASTCSGKKTPRTGRHGANKENLELNGSI 
LSGGYPGSAPLQRNFSrNSVASTYSEFADPSLS 
DSSTVGLQRELSKASKSDATSGILNSTN1QS 


551 


1901 


A 


4194 


3 


1008 


"XWlffiGL^SSPAIGAYLSASYGDSLVVLVAl'V 
VALLDICFILVAVPESLPEKMRPVSWGAQISW 
KQADPFASLKKVGKDSTVLLMCITVCLSYLPE 
AGVQYSSFFu^YLR\QVlGFG\TVTCIAAFIAMVGl 
LSIVAQTAFLSILMRSLGNKNTVLLGLGFQML 
QLAWYGFGSQAWMMWAAGTVAAMSSnTP 
AI SAL VSRN AESDQQG V AQGHTGIRGLCNGL 
GPALYGFIFYMFHVELTELGPKLNSNNVPLQ 
GAV1PGPPFLFGACIVLMSFLVALFIPEYSKAS 
G VQKHSN S S SGSLTNTPERG SDEDtEPLLQD S 
SIWELSSFEEPGNQCTEL 
552


1902 


A 


4197 


2 


14302 

1 
i 


ARPPPAPGSRQQKQKAAPGAAAAAELRGAR 

EP AP ARRRGTMADGG EGEDEIQFLRTDDE W 

LQCTATIHKEQQKLCLAAEGFGNRLCFLESTS 

NSK^TVPPDLSICTP/LEQSLSVRALQEMLANT 

VEKSEGQVDVEKWKJTvlMKTAQGGGHRTLL 

YGHAILLRHSYSGMYLCCLSTSRSSTDKLAFD 

VGLQEDTTGEACWWT1HPASKQRSEGEKVR 

VGDDLILVSVSSERYLHLSYGNGSLHVDAAF 

QQTLWSVAPISSGSEAAQGYLIGGDVLRLLH 

GHMDECLTVPSGEHGEEQRRTVHYEGGAVS 

VHARSLWRLETLRVAWSGSHIRWGQPFRLR 

HVTTGKYLSLMEDKKLLLMDKEKADVKSTA 

FTFRSSKEKLDVGVRKEVDGMGTSEIKYGDS 

VCYIQHVDTGLWLTYQSVDVKSVRMGSIQR 

KA1MHHEGHMDDGISLSRSQHEESRTARV1RS 

TVFLFNRPIRGLDALSKKAKASTVDLPIESVSL 

SLQDLIGYFHPPDEHLEHEDKQNRLRALKNR 

QNLFQEEGMINLVLECIDRLHVYSSAAHFAD 

VAGREAGESWKSILNSLYELLAALIRGNRKK 

CAQFSGSLDWLISRLERLEASSGILEVLHCVL 

VESPEALN1IKEGHIKSIISLLDKHGRNHKVLD 

VLCSLCVCHGVAVRSNQHLICDNLLPGRDLL 

LQTRLVNHVSSMRPNIFLGVSEGSAQYKKWY 

YELMVDHTEPFVTAEATHLRVGWASTEGYSP 

YPGGGEEWGGNGVGDDLFSYGFDGLHLWSG 

CIARTVSSPNQHLLRTDDVISCCLDLSAPSISF 

RINGQPVQGMFENFNIOGLFFPVVSFSAGLKV 

RFLLGGRHGEFKFLPPPGYAPCYEAVLPKEKL 

KVEHSREYKQERTYTRDLLGPTVSLTQAAFT 

PIPVDTSQIVLPPHLERIREKLAENIHELWVMN 

KJELGWQYGPVRDDNKRQHPCLVEFSKLPEQ 

ERhTYNLQMSLETLKTLLALGCHVGISDEHAE 

DKVKKMKLPKNYQLTSGYKPAPMDLSFIKLT 

PSQEAMVDKLAENAHNVWAJRDRJRQGWTY 

GIQQDVKNRRNPRLVPYTPLDDRTKKSNKDS 

LREAVRTLLGYGYNLEAPDQDHAARAEVCS 

GTGERFTUFRAEKXYAVKJU3RWYFEFETVTA 

GDMRVGWSRPGCQPDQELGSDERAFAFDGF 

KAQRWHQGNEHYGRSWQAGDWGCMVDM 

NEHTMMFTLNGEILLDDSGSELAFKDFDVGD 

GFEPVCSLGVAQVGRMNFGKDVSTLKYFTIC 

GLQEGYEPFAVNTrmDITMWLSKRLPQFLQV 

PSNHEHIEVTRIDGTIDSSPCLKVTQKSFGSQN 

SNTDIMFYRLSMPIECAEVFSKTVAGGLPGAG 

LFGPKNDLEDYDADSDFEVLMKTAHGHLVP 

DRVDKDK^TKPEHWHKDYAQEKPSRLKQ 

RFLLRRTKPDYSTSHSARLTEDVLADDRDDY 

DFLMQTSTYYYSVRIFPGQEPANVWVGWITS 

DFHQYDTGroLDRNnRTVTVTLGDEKGKVHE 

SIKRSNCYMVCAGESMSPGQGRNNNGLEIGC 

WDAASGLLTFIANGKELSTYYQVEPSTKLFP 

AVFAQATSPNVFQFELGRIKNVMPLSAGLFKS 

EHKNPVPQCPPRLHVQFLSHVLWSRMPNQFL 

^lmircnicnDAnin \/pjn TiPT PiFX/fsT HTPFEN 
K VD VS RJ SLKv^J " ^ v v^*-^* V r rvioj-rrnr 

RSVDBLELTEQEELLKFHYHTLRLYSAVCALG 
NHRVAHALCSHVDEPQLLYAIENKYMPGLXR 
AGYYDLL3DIHLSSYATARLMMNNEYIVPMT 
EETKSITLFPDENKXHGLPGIGLSTSLRPRMQF 
SSPSFVSISNECYQYSPEFPLDILKSKTIQML'rE 
AVKEGSLHARDPVGGTTEFLFVPLIKLFYTLLI 






MGIFHNEDLKHILQUEPS VFKEAA1 PEbfcbD 1 

LEKELSVDDAKLQGAGEEEAKGGKRPKEGLL 

QMKLPEPVKLQMCLLLQYLCDCQVRHR1EAI 

VAFSDDFVAKLQDNQRFRYNEVMQALNMSA 

ALTARKTKEFRSPPQEQINMLLNFKDDKSECP 

CPEEIRDQLLDFHEDLMTHCGIELDEDGSLDG 

NSDLTIRGRLLSLVEKVTYLKKJCQAEKPVES 

DSKKS STLQQLISETMVRWAQESV1EDPELVR 

AMFVLLHRQYDGIGGLVRALPKTYTINGVSV 

EDTINLLASLGQIRSLLSVRMGKEEEKLM1RG 

LGD1MNNKWYQHPNLMRALGMHETVMEV 

MV7WLGGGESKEITFPKMVANCCRFLCWCR 

I SRQNQKAMFDHLS YLLEN SS VGL ASP AMRG 

STPLDVAAASVMDNNELALALREPDLEKWR 

YLAGCGLQSCQMLVSKGYPDIGWNPVEGER 

YLDFLRFAVFCNGESVEENANVVVRLLIRRPE 

CFGP ALRGEGGNGLL AAMEEA1KI AEDPSRD 

GPSFN SGSSKTLDTEEEEDDTIHMGN A1MTFY 

S ALIDLLGRC APEMHLIHAGKGEAIRIRSILRS 

LtPLGDLVGVISlAFQMPTIAKDGNVVEPDMS 

AGFCPDHKAAMVLFLDRVYGIEVQDFLLHLL 

EVGFLPDLRAAASLDTAALSATDMALALNRY 

LCTAVLPLLTRCAPLFAGTEHHASLIDSLLHT 

VYRLSKGCSLTKAQRDSIEVCLLSICGQLRPS 

MMQHLLRRLVFDVPLLNEHAKMPLKLLTNH 

YERCWKYYCLPGGWGNFGAASEEELHLSRK 

LFWGIFDALSQKKYEQELFKLALPCLSAVAG 

ALPPDYMESNYVSMMEKQSSMDSEGNFNPQ 

PVDTSNTTIPEKLEYFINKYAEHSHDKWSMDK 

L ANGW7YGEIY SDS SK VQPLMKPYKLLSEKE 

KEIYRWPnCESLKTMLARTMRTERTREGDSM 

ALYNRTRRISQTSQVSVDAAHGYSPRAIDMS 

NVTLSRDLHAMAEMMAENYHNIWAKKJKJCM 

ELESKGGGNHPLLVPYDTLTAKEKAKDREKA 

QDELKFLQINGYAVSRGFKDLELDTPSIEKRFA 

YSFLQQLIRYVDEAHQYILEFDGGSRGKGEHF 

PYEQEIKFFAKWLPLIDQYFKNHRLYFLSAA 

SRPLCSGGHASNKEFCEMVTSLFCKLGVLVRH 

RJSLFGNDATSrVNCLHILGQTLDARTVMKTG 

LESVKSALRAFLDNAAEDLEKTMENLKQGQF 

THTRNQPKGVTQIINYTTVALLPMLSSLFEHI 

GQHQFGEDLILEDVQVSCYRILTSLYALGrSK 

S IYVERQRS ALGECL AAF AG AFP V AFLETHLD 

KJWIYSIYNTKSSRERAALSLPTNVEDVCPNIP 

SLEKLMEEIVELAESGIRYTQMPHVMEVILPM 

LCSYMSRWWEHGPENNPERAEMCCTALNSE 

HM^TLLGMLKHYNTnTGIDEGAVVMKRLAVF 

SQPONKVKPQLLKTHFLPLMEKLKKKAATVV 

SEEDHLKAEARGDMSEAELL1LDEFTTLARDL 

YAFYPLLIRFVDYNRAKWLKEPNPEAEELFR 

MVAEVFIYW SKSHNFKREEQNFV VQNEINN 

MSFLrroTKSKMSKAAVSIXJERKKMKXKGD 

RYSMQTSLIVAALKRLLPIGLNICAPGDQEL1A 

t a Krm FSLKDTEDE VRDIIRSN1HLQGKLEDP 

A1RWQMALYKDLPNRTDDTSDPEKTVERVL 

DIANVLFHLEQKSKRVGRRHYCLVEHPQRSK 

KAV^WHKLLSKQRKRAWACFRMAPLYNLPR 

HRAVNLFLQGYEKSWIETEEHYFEDKLIEDLA 

KPGAEPPEEDEGTKRVDPLHQLILLFSRTALT 

EKCKLEEDFLYMAYADIMAKSCHDEEDDDG 
EEEVKSFEEKEMEKQKLLYQQARLHDRUAA

Ervm.QTISASKGETGPMVAATLKLGIAILNGG 

NSTVQQKMLDYLKEKKDVGFFQSLAGLMQS 

CSVLDLNAFERQNKAEGLGMVTEEGSGEKV 

LQDDEFTCDLFRFLQLLCEGHNSDFQNYLRT 

QTGNNTTVNinSTVDYLLRVQESISDFYWYY 

SGKDV1DEQGQRNFSKAIQVAKQVFNTLTEYI 

QGPCTGNQQSLAHSRLWDAWGFLHVFAHM 

QMKI.SQDSSQIELLK£lJVtDLQKDMVVMLLS 

MLEGNVVNGTIGKQMVDMLVESSNNVEMIL 

KJFFDMFLKXKDLTSSDTFKEYDPDGKGV1SK 

RDFHKAMESHKHYTQSETEFLLSCAETDENE 

TLDYEEFVKRFHEPAKDIGFrWAVLLTNLSEH 

MPNDTRLQTFLELAESVLNYFQPFLGR1EIMG 

SAKRIERVYFEISESSRTQWEKPQVKESKRQFI 

FDVVNEGGEKEKMELFVNFCEDTIFEMQLAA 

Q1SESDLNERSANKEESEKERPEEQGPRMAFF 

SILTVRSALFALRYNILTLMRMLSLKSLKKQM 

ICKVKKMTVKDMVTAFFSSTV^SIFMTLLHFV 

ASVFRGFFRUCSLLLGGSLVEGAKKIKVAELL 

ANMPDPTQDEVRGDGEEGERKPLEAALPSED 

LTDLKELTEESDLLSDIFGLDLKREGGQYKLIP 

HNPNAGLSDLMSNPVPMPEVQEKFQEQKAK 

EEEKEEKEETKS EPEKAEGEDGEKEEKAKED 

KGKQKLRQLHTHRYGEPEVPESAFWKKHAY 

QQKLLNYFARNFYNMRMLALFVAFAINFCLL 

FYKVSTSSWEGKELPTRSSSENAKVTSLDSS 

SHRIIAVHYVLEESSGYMEPTVRILPILHTVISF 

FCIIGYYCLKVPLVIFKREKEVARKLEFDGLYI 

TEQPSEDDIKGQWDRLVTNTQSFPNNYWDKF 

VKRKVMDKYGEFYGRDRISELLGMDKAALD 

FSDAREKKKPKKDSSLSAVLNSIDVKYQMW 

KLGVVFTDNSFLYLAWYMT 


553


1903


A


4199 


31 


767 


"LPELNGRGAGLRRAEPSERGGGAER'l QQ V M 
LPLSHGHSHGGGGCRCAAER/VGAARGSAAC 
AYGLYLRIDKGRLQCLNESREGSGRGVFKPW 
ERAD\DRSKF VE SD ADEELLFNIPFTG\HVKLK 
GIIIMGEDDDSHPSEMRLYKNIPQMSFDDTER 
EPDQTFSLNRDLTGELEYATKISRFSNVYHLSI 
HISKNFGADTTKVFYIGLRGEWTELRRHEVTI 
CNYEASANPADHRVHQVTPQTHFIS 


554 


1904 


A 


4200 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSL 

EICIKACKNl^YGEEKKKKCNPYVKTYLLPD 

RSSQGKRKTGVQRNTVDPTFQETLKYQVAPA 

OLVTRQLQVSVWHLGTLARRWLGEVUPLAT 

WDFEDSTTQSFRWHPLRAKADKYEDbVPQb 

NGELTVRAKLVLPSRTRKLQEAQEGTDQPSL 

HGQLCLVVLGAKNLPVRPDGTLNSFVKGCLT 

LPDQQKLRLKSPVLRXQACPQWKHSFVFSGV 

TPAQLRQSSLELTVWDQALFGMNDRLLGGTV 

RLGSKGDTAVGGDACSQSKLQWQKVLSSPN 

LWTDMTLVLH 


555 


1905 


A 


4211 


331 


2419 


"KENTCKARNLRMNQSRSRSDGGSEETLPQDH 

NHHENERRWQQERLHREEAYYQFINELNDE 

DYRLMRDHNLLGTPGEITSEELQQRLDGVKE 

QLASQPDLRDGTNYRDSEVPRESSHEDSLLE 

WLNTFRRTGNATRSGQNGNQTWRAVSRTNP 

NNGEFRFSLEIHVNHENRGFEIHGEDYTDIPLS 

DSNRDHTANRQQRS1ASPVARRTRSQTSVNFN 

GS SSNIPRTRLASRGQNPAEG SFSTLGRLRNGI 
SEQ ID NO: of nucleotide sequence 


SEQ ID NO: of peptide sequence 


Method 


SEQ ID NO: in USSN 09/496,914 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 














"GGAAGIPRANASRTNFSSH'1'NQSGGSELRQKb 

GQRFGAAHVWENGARSNVTVRNTNQRLEPI 

RLRSTSNSRSRSP1QRQSGTVYHNSQRESRPV 

QQTTRRSVRRRGRTRVFLEQDRERERRGTAY 

TPFSNSRLVSRITVEEGEESSRSSTAVRRHPTIT 

LDLQVRVRIRPGENRDRDSIANRTRSRVGLAE 

NTVTffiSNSGGFRRTISRLERSGIRTYVSTITVP 

LRRISENELVEPSSVALRS1LRQIMTGFGELSSL 

MEADSESELQRNGQHLPDMHSELSNLGTDN 

NRSQHREGSSQDRQAQGDSTEMHGENETTQP 

HTRNSDSRGGRQLRNPNNLVETGTLPILRLAH 

FFLLNESDDDDRIRGLT1CEQIDNLSTRHYEHN 

S1DSELGKJCSVCISDYVTGNKLRQLPCMHEF 

HIHCIDRWLSENCTCPICRQPVLGSN1ANNG 


556 


1906 


A 


AO \ *> 
1 Z 




462 


' LQRQRQHPAAAPA VP VRCFTFCFTDIV1MPKR 
KSPENTEGKIXjSKVTKQEPTRRSARLSAKPA 
PPKPEPKPRKTSAKKEPGAKISRGAKGKKEEK 
QEAGKEGTAPSENGETKAEEIHISRSTVNVST 
SRGTPPSTLSVKGQIETVRVKGTEN 


557 


1907 


A 


4213 


774 


507 


r ARRFSCLTLQTSWGHRH\GPPRP\ANFVFLV 1 1 
GFLH1GQAGHKLPTSGDPPASASQSARITGMS 
HRTWFLASFLIDSCKNHVYKIMYTL 


558 


1908 


A 


4225 


3 


1253 


TYRHAEREHPETSSA'IKVSYDYRHKRPKLLU 

GDQDFSDGRTQKYCKEEDRKYSFQKGPLNRE 

LDCFNTGRGRETQDGQVKEPFKPSKKDSIAC 

TYSNKNDVDLRSSNDKWKEKKKJCEGDCRKE 

SNSSSNQLDKSQKLPDVKPSPINLRKKSLTVK 

VDVKKTVDTFRVASSYSTERQMSHDLVAVG 

RKSENFHPVFEHLDSTQNTENKPTGEFAQEIIT 

HHQVKANYFPSPGITLHERFS\KMAI)IHKADV 

NEIPLNSDPEIHRRIDMSLAELQSKQAVTYESE 

QTL1KIIDPNDLRHDIERRRKERLQNEDEHIFHI 

ASAAERDDQNSSFSKNYTTQRKDIITHKPFEV 

EGNHRNTRVRPFKSNFRGGRCQPNYKSGLVQ 

KSLYIQAKYQRLRFTGPRGFITHKFRERLMRK 


559 


1909 


A 


4235 


1 


323 


KFSIPFFLRWSFTLVNPRLEGNDNaSVHCNUjl. 
LGLSHSPASASQVGGITGTQHHTGLIFGFLIET 
EFHHVGQAGLELLTSGDPPALAFQSAGITGVS 
HHAWLQVLNS 


560 


1910 


A 


4246 


2 


1569 


TLSLLERVLMKDIVTPVPQEEVKTVlRKCLfcQ 

AALVNYSRLSEYAKIEGKKREMYELPVFCLA 

SQVMDLTIQNQKDAENVGRLITPAKKLEDTIR 

LAELVIEVLQQNEEHHAEAFAWWSDLMVEH 

AETFLSLFAVDMDAALEVQPPDTWDSFPLFQ 

LL\NDFLRTGLLICGNGK\FHKHLQDLFAPLW 

R^MWDLDGSSPIAQSIHRGLLSRESWEPVNN 

G SGTSEDLPWKLD ALQTFIRDLH WPEEEFGK 

HLEQRLKLMASDMBESCVKRTRUAFEVKLQK 

TSS1QQIFRWQFNMAPCFNVMGLMAKGSIQP 

KL\CSMEMGQEFAKMWHQYHSKIDEL1EETV 

KEM1TLLVAKFVTILEGVLAKLSRYDEGTLFS 

SFLSFTVKAA SK YVD VPKPGMDVADA YVTF 

VRHSQDVLRDKVN^EMYIERLFDQWYNSSM 

NVICTWLTDRMDLQLHIYQLKTLIRMVKKTY 

RDFRLQGVLDSTLNSKTYETIRNRLTVEEATA 

S V SEGGGLQG I SMKDSDEEDEEDD 1 


561 


1911 


A 


4257 


1300 


654 


SELVQFLLOO)QKXIPIKRADILKiiVIGDYT<X)I 
FPDLFKRAAERLQYVFGYKLVELEPKSNTYIL 
INTLEPVEEDAEMRGDQGTPTTGLLM1VLOL1 
FMKGNTIKETEAWDFLLAL\GVYPTKJCHLIFG 
DPKKLITEDFVRQRYLEYRjRJPHTDPVDYEFQ 
WGPRTNLETSKMKVLKFVAKVHNQDPKDW 
PAOYCEALADEENRARPQPSGPAPSS 


562 


1912 


A 


4260 


1 


1498 


MVTWLYRFLPTSNMAAKLRSLLPPDLRLgt 

WLHARLQKCFLSRGCGSYCAGAKASPLPGK 

MAMGLMCGRRELLRLLQSGRRVHSVAGPSQ 

WLGKPLTTRLLFPAAPCCCRPHYLFLAASGPR 

SLSTSAISFAEVQVQAPPWAATPSPTAVPEV 

ASGETADWQTAAEQSFAELGLGSYTPVGLI 

QNLLEFMHVDLGLPWWGA1AACTVFARCL1F 

PLIVTGQREAARIHNHLPEIQKFSSRIREAKLA 

GDH1EYYKASSEMALYQKKHGIKLYKPLILPV 

TQAPIFISFFIALREMANLPVPSLQTGGLWWF 

QDLTVSDPIYILPLAVTATMWAVLELGAETG 

VQ S SDLQ WMRKV1RMMPLITLPITMHFPT AV 

FMYWLSSNLFSLVQVSCLR1PAVRTVLKIPQR 

WHDLDKLPPREGFLESFKKGWKNAEMTRQ 

LREREQRMRNQLELAARGPLRQTFTHNPLLQ 

PGKDNPPNIPSS\SSSSSKPKSKYPWHDTLG 


563 


1913 


A 


4265 


623 


116 


MGGLAPTQTLEPT^EYQNTQLSVSYLLPtgN 

THGTRRTLS SGPSNNLPLPLS S S ATMPSMQCK 
HRSPNGGLFRQSPVKATPIPMSFQPVPGGVXL 
PRGSGNPPHGTSILTAPPALLPHPPTHPTQQSF 
LIQENNNTNHTHSHTHTYTETLSFFLYICVNN 

DRMEWGKSVF 


564 


1914 


A 




3 


368 


'" ILKRKLSSLNSEVSTIQNTRMLAFKATAgLMb 
GCTWCLGLLQVGPAAQVMAYLFTIINSLQGF 
FIFL VYCLLSVQQ VQK QYQK WFREI VK S KSES 
ETYTLSSKMGPDSKPSEGDVFPRTSE 


565 


1915 


A 


4288 


83 


406 


RNSRPLWCSPPASQPRQAPVSQSCCCHLFSbSS 

PPSALLAPTKPRALGTLRLYECSPELGTTMLP 

PAWLLMLCQAPRPQDPDPRLTQPEKSLQEAP 

GOTGASRTPRT 


566 


1916 


A 


4298 


1041 


229 


' LN S S QKL ACL1G VEGGH SLDS SLS VLRSF YVL 
GVRYLTLTFTCSTPWAESSTKFRHHMYTNVS 
GLTSFGEKWEELNRLGMMIDLSYASDTLIRR 
VLEVSQAPV1FSHSAARAVCDNLLNVPDDILQ 
LLKKNGGIVMVTLSMGVLQCNLLANVSTVA 
DHFDHIRAVIGSEFIGIGGNYDGTGRFPQGL\E 
D V STYP VL1EELLSRS W SEEELQGVLRGNLLR 
VFRQVEKVREESRAQSPVEAEFPYGQLSTSCH 

FHLGASEWTPRLLIWR 


567 


1917 


A 


4299 


1 


1106 


" GATPLGSVGGRTGKMDAATLTYDTLRFAJbhb 
DFPETSEPVWILGRKYSIFTEKDEILSDVASRL 
WFTYRKNFPA1GGTGPTSDTGWGCMLRCGQ 
MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 
VLNAFEDRKDSYYSIHQ1AQMGVGEGKSIGQ 
WY GPNTVAQVLKKLA VFDTWSSLAVHIAMD 
NTVVMEEIRRLCRTSVPCAGATAFPADSDRH 
CNGFPAGAEVTNRPSPWRPLVLLrPLRLGLTD 
INEAYVETLKHCFMMPQSLGVIGGKPNSAHY 
F1GYVGEELIYLDPHTTQPAVEPTDGCFIPDES 
FHCQHPPCRMSIAELDPSIAWRGGHLSTQAF 
GAECCLGMTRKTFGFLRFFFSMLG 


568 
569 


1918 
1919 


A 
A 


4300 
4302 


2012 
186 


1843 
531 


SRKFLTITPIVLYFLTSFYTKYDQIHFVLN 1 Vb 
LMSVLIPKLPQLHGVRIFGINKY 
" WTFCLFIVWWVPES AR WLLTQGH VKbAHK * 
LLHCARLNGRPVCEUS^QE VR VNVC V SMH J 
CVWWGVGCVKCLPPRAHHIWQEKPLGPHRT 
v T fqkt F.AFr T KTKEKAREKERKKKS 


570 


1920 


A 


4308 


3 


869 


RSGQGKVYGL1GRRRFQQMDVLEGLNLLI I lb 

GKRNKLRVYYLSWLRNKIIJHNDPE\rEKKQG 

WTTVGDMEGCGHYRVVKYERI1CFLVIALKSS 

VEVYAWAPKPYHKFMAFKSFADLPHRPLLV 

DLTVEEGQRLKVIYGSSAGFHAVDVDSGNSY 

DIYIPVH1QSQITPHAIIFLPNTDGMEMLLCYE 

DEGVYVNTYGRUKDVVLQWGEMPTSVAYIC 

SNQIMGWGEKAIEIRSWGHLDGVFMHKRA 

QRLKFLCERNDKVFFASVRSGGSSQVYFMTL 

>rpH^T\^MW 


571 


1921 


A 


4309 


9 


524 


" ASREMDVTKVCGEMRYQLNKTNMEKDbAt 
KEHREFRAKTNRDLEIKDQEIEKLRIELDESK 
QHLEQEQQKAALAREECLRLTELLGESEHQL 
HLTRQEKJDSIQQSFSKEAKAQALQAQQREQE 
LTQKIQQMEAQHDKTENEQYLLLTSQNTFLT 
KLKEECCTLAKKLEQISQ 


572 


1922 


A 


4318 


1 


1119 


GATPLGS VGGRTGKMDAATLTYDTLRh Ati- t, 

DFPETSEPVWILGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAIGGTGPTSDTGWGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVXKKLAVFDTWSSLAVHIAMD 

NTVVMEEIRRLCRTSVPCAGATAFPADSDRH 

CN GFPAGAEVTNRPSP WRPL VLLIPLRLGLAT 

DINEAYV^TLNKHCFHGWPQFPGATVHREGK 

PNSAHYFlGYVGEELrYLDPHTTQPAVEPTDG 

CFIPDESFHCQHPPCRMSIAELDPSIAVVRGGH 

T $jn /k pn AVr.Cl .GMTRKTFGFLRFFFSMLG 


573 


1923 


A 


4333 


JO J 


1066 


GGVPVGLASKPFQILYOH'lNEVLSVGlSTbLU 
MAVSGSRDGTVIIHTIQKGQYMRTLRPPCESS 
LFLTIPNLAISWEGHTVVYSSTEEKTTLICVERM 
HY1CFSINGKYLGSQ1LKEQVSDICIIGEHIVTG 
SIOGFLSIRDLHSLNLSINPLAMRLPIHCVCVT 
KEYSHILVGLEDGKJLIWGVGKPAEVKPSISN 
FISHAVGDYFGSPSFQLIEKSPLGINKLKAKFD 

FSKGSK. 


574 


1924 


A 


4346 


359 


1234 


" " MDTLEEVTW ANGSTALPPPLAPN1 S V FHKU1J- 
LL YEDrGTSR VRYWDLLLLIPNVLFLrFLL WK 
LPSARAKJRITSSPIFITFYILVTVVALVG1ARA 
WSMTVSTSNAATVADKILWEITRFFLLAIEL 
SVTTLGLAFGHLESKSSIKRVLAITTVLSLAYSV 
TOGTLEILYPDAHLSAEDFNIYGHGGRQFWL 
VSSCFFFL VY SL WDLPKTPLKERJ SLP SRRSF Y 
VYAGILALLNLLQGEGSVLLCFDIIEGLCCVD 
ATTFLYFSFFAPLIYVAFLRGFFGSEPKILF 


575 


1925 


A 


4360 


2038 


1512 


GCWWRHP WL ASQRDCLDCRIQL AEKJ 4 V KA v 

SK^SRPDMNPfRVKEVYRLEEMEKIFVRLEM 

K11KGSSGTPKLSYTGRDDRHFVPMGLYIVRT 

VNEPWTMGFSKSFKKKFFYNKKTKDSTFDLP 

ADS1APFHICYYGRLFWEWGDGIRVHDSQKP 

ODODKLSKEDVLSFIQMHRA 


576 


1926 


A 


4365 


69 


500 


QVEGRQGREVKRTAWRlSPVWRPARCKKKb l 
PQP/PE/PGAQQQERHRQGEAPMQALDPRAEP 
GPQAQSHAACQPEPEPPRVLLDPTAARGGVQ 
GRP/GLSRHPGLAPHPQTHTPWPQSGRLPCAS 
EPLPLGGIRPTPGLEPKGRDLM 


577 


1927 


A 


4366 


785 


502 


SAPPKKKNGVLFLSPRLKSSGAIWVHSTFILW 
ASSNSRASTPKVAGITGARPHAR1IFVFLIEMG 
FHNVG0AGI7DTLTLV1CPP0PPKLLGLQM 


578 


1928 


A 


4367 


1 


221 


FFFFLKKSRCVTQAGVQGNPISLHPPPPGFKRh' 
SRLSLLSSWDYRHP/HAANFCEFSRDGWSPYW 

SGWSRTPDLR 


579 


1929 


A 


4383 


1 


224 


FETESHSVTQAGMQWHNLGSLQPMP/PGLKR 
FSCLRLQSSWDHRHAPPHLAHFCIFSRDGVSP 
CWPGWSSTPDLK 


580 


1930 


A 


4397 


410 


94 


SRLKPYSTNVTAKKLPATNIPNLDCFTAKL Y Q 
WFKKGIMHILHELFQNKEEGAFPNS/FYEASFT 
LRPKSDRD1AKEESYSTISLLSTDTKILMSKYK 
OLKSSDL 


581 


1931 


A 


4414 


670 


3 


VLVHRQCGGILRLRRKEAVS VLDSADltV 1 Db 

RLPHATIVDHRPQHRWLETCNAPPQLIQGKA 

RSAPKPSQASGHFSVELVRGYAGFGLTLGGG 

RDVAGDTPLAVRGLLKDGPVAQRCGRLEVGD 

LVLHINGESTQGLTNHAQAVERIRAGGPQLHL 

V1RRPLETHPGKPRGVGEPRKGWPSWPDRSP 

DPGGPEVTGSRSSSTSLVQHPPSRTTLKKTRG 

SPE 


582 


1932 


A 


4424 


194 


449 


VtVlRKKKJU-EKLRHQLMPMYNFDFI'KlsQDE 
LEQELLEHGRDAASVQAATSVQAMQGKTTL 
PS\OGPLQRPSRLVFT\DVANAIHV 


583 


1933 


A 


4435 


1 


166 


APGPPVPPPGSPPEQMPGPCPASMPP/DPPPGS 
PPEQMPGPCPVSAPP/GPPPGSPPEQMPGPCPV 

SAPPALLQDTSV 


584 


1934 


A 


4439 


1 


628 


SATPQQPSAPQHQGTLNQPPVPGMDESMS Y Q 

APPQQLPSAQPPQPSNPPHGAHTLNSGPQPGT 

APATQHSQAGPATGQAYGPHTYTEPAKPKK 

GQQLWKRMKPAPGTAEVSSSTSRSDPLLLPPR 

AL APTQRASTWL APSPT/SEKVQNHSG S S AR 

GNLSGKPDDWP/LGHERVCGALLHRL*VGGG 

OGPHGKAAQGGAAGAAAGRLGLYH 


585 


1935 


A 


4463 


10 


144 


HKPVTNSRDTQEWLEKAKQVLKIIATFKHTT 
S1FDDFAHYEKRQ 


586 


1936 


A 


4464 


1309 


103 


LNAESYVSFTTKLDIPTAAKYEYGVFLg 1^U5 

FLRFPSSLTSSLCTDNNPAAFLVNQAVKCTRK 

INLEQCEEIEALSMAFYSSPEILRVPDSRKKVPI 

TVQS1VIQSLNKTLTRREDTDVLQPTLVNAGH 

FSLCVNVVLEVKYSLTYTDAGEVTKADLSFV 

LGTVSSVWPLQQKFEIHFLQENTQPVPLSGN 

PGYWGLPLAAGFQPHKGSGUQTTNRYGQLT 

ELHSTTEQDCLALEGVRTPVLFGYTMQSGCK 

LRLTGALPCQLVAQKVKSLLWGQGFPDYYA 

PFGNSQGP/ADMLDWVPIHFITQSFNRKDSCQ 

LPGALVTEVKWTKYGSLLNPQAXIVNVTANLI 

SSSFPEANSGNERmiSTAVTFVDVSAPAEAG 

FRAPP AINARLPFNFFFPFV 


587 


1937 


A 


4471 


614 


387 


" LLGRAS AC/LQLQSS W/D/HRPMLPYL ANF Vh 
CKDR/SFTWLPRLVLNSWLQVILLPWPPTGCD 

NKHEPPCPATKRRHSGSI 
— Tm>t r'ci nppppr^FVffF^n STF PSS WD YRLMFF 


588 


1938 


A 


4480 


1720 


1458 


HDLGoLl^r Jr r r Ljr isjn-t oL-LoLroo tvj-* * *n*-i ±± *■ 

CP ANFCini/DFLVETGFHHVGQASHELLTSGD 
PPTSASOSAGITGMSYHTWFGES 


589 


1939 


A 


4487 


922 


332 


APVTTSPRVGQPW/RTALALRSLYRARPSLRC 
PPVELPV/APRRGHRLSPADDELYQRTRISLLQ 
REAAQAMYIDSYNSRGFMINGNRVLGPCALL 
PHSWQWNVGSHQDITEDSFSLFWLLEPRIEI 
VWGTGDRTERLQ SQ VLQAMRQRGIA V b V ^ 
DTPNAC ATFNFLCHEGRVTGAALIPPPGG 1^>L 

TSLGQAAQ 


590 


1940 


A 


4492 


1 


472 


FFFFETESRS V AQ AG VQ WRDLGSLQ AP F FUr* 1 

PFSCLSLPSSWDYRRPPLRPANFFVFLVETGFP 

RFSRDGLDLLT/S/GDPPTSASQSAGITGVSHR 

ARPKRIGEPRRKCGNAWWPSTSLGDHRVTS 

VPHOGGLPGPIRVAPSSAGQREASQGPPGR 


591 


1941 


A 


4495 


1444 


1116 


IAARFTLAKTWNQLKRP\lMlDSIKXTR\Yiri 
MEYYADTERNEIMSFVAGTWVELEAIILSKLM 
LKDNWVEDTIPQGAVPCTATAEGMKRLLFAL 
EPWDSSCFPHPSSGV 


592 


1942 


A 


4496 


2 


919 


RTRPLFSGRPTRPVCTMSDERRLPGSAVCiWL 

VCGGLSLLANAWGILSVGAKQKKWKPLEFL 

LCTLAATHMLNVAVPIATYSWQLRRQRPDF 

EWNEGLCKVFVSTFYTLTLATCFSVTSLSYHR 

MWMVCWPVNYRLSNAKKQAGHTVMGIWM 

GSFILSALPAVGWHDTSERFYTHGCRFIVAE1 

GLGFGVCFLLLVGGSVAMGVICTAIALFQTL 

AVQVGRQADHRAFTVPTIVVEDAQGKRRSSI 

DGSEPAKTSLQTTGLVTT1VFIYDCLMGFPVL 

GPFSLADTHLSDLPYTWGDRJDSGGACVM 


593 


1943 


A 


4506 


2 


193 


FFFEAESCSVPQAGVQRPDLGWLHAPmo^ 
HFPASASQVAGTTHARHHTQLIFWLVENGL 

C 


594 


1944 


A 


4507 


ID/./ 


647 


KMAGGVRPLRGLRALCRVLLFLSQFClLbUO 

ESTEIPPYVMKCPSNGLCSRLPADCIDCTTNFS 

CTYGKPVTFDCAVKPSVTCVDQDFKSQKNFII 

NMTCRFCWQLPETDYECTNSTSCMTVSCPRQ 

RYPANCTVR\DHVHCLGNRTFPKMLYCNWT 

GGYKWVYGLWLLRHHPRWGLGADRFYYLGP 

VAGTASGKLFSFGGLGIWTLIDVLLIGVGYVG 

PADGSLYI 


595 


1945 


A 


4512 


533 


264 


" FFFKMESYSVARLECSGAISAPCNLHLLCiSNrs 
SPASASRV/AGN1GARHHTQQIFVLLVQMRVH 
YVGODGLDLL/NLMIHPPRSPKVLGLQA 


596 


1946 


A 


4513 


3 


1674 


" HASDHLYPNFLVNELrLKQKQRFEEKRbKLL) 
HSVSSTNGHRWQIFQDWLGTDQDNLDLANV 
NLN^ELLVQKKKQLEAESHAAQLQILMEFLK 
VARRNKREQLEQIQKELSVLEEDDCRVEEMS 
GLYSPVSEDSTVPQFEAPSPSHSSIIDSTEYSQP 
PGFSGSSQTKKQPWYNSTLASRRKRLTAHFE 
DLEQCYFSTRMSRISDDSRTASQLDEFQEC\LS 
ICRTRYNSVRPL\ATLSYASDLYNGSQYKSLV 
FEFDRDCDYFAIAGVTKKIKVYEYDTVIQDA 
VDIHYPENEMTCNSKISCISWSSYHKNLLASS 
DYEGTVILWDGFTGQRSKVYQEHEICRCWSV 
DFNLMDPKLLASGSDDAKVKLWSTNLDNSV 
ASIEAKANVCCVKFSPSSRYHLAFGCADHCV 
HYYDLRNTKQPIMVFKGHRKAVSYAKFVSG 
EEIVSASTDSQLKLWNVGKPVYCLRSFKGHIN 
EKNFVvGLASNGDYIACGSENNSLYLYYKGLS 
KTLLTFKFDTVKSVLDKDRKEDDTNEFVSAV 
CWRALPDGESNVLIAANS\QGTI\KVLELV 


597 


1947 


A 


4518 


536 


824 


RSLALSPGLECSGNDSAHCNLHLLGSSUrn 5> 
ASQVAEITSVRHHTWLIFCI^GQMGFHHVGE 
OAGLELLTSWDPAILPSQSAGIIGMSPHAWPP 


598 


1948 


A 


4524 


1 


384 


FDTEFVNIGGDFDAAAGVFRNCRLPGAYFhbf 
TLGKLPRKTLS VKLMKNR0EVQ AMIYDDGS S 
RRREMQSQSVMLALRRGDAVWLLSHDHJJU 
YGAYSNHGKYTTFSGFLVYPDLAPAAPPGLG 


599 


1949 


A 


4526 


366 


776 


MGQPAPYAEGPIQGGDAGELCKCDFLVKrSf 
KPEAVCEAGTPAMFQTAWRQMESCSI/AQAG 
VQWRDPGSLHPPPLGFKRFSCLSLPSSWDYK 
HAPPHPANFCIFSRDQVSPCWTGWSRSLDLVI 

PPPWLPKVLGLQA 


600 


1950 


A 


4529 


776 


334 


"FFFETESCYVAQAGVQWCDLCSLQAPPPU\bi> 
DPPASASRVAGTTGARHHTQLIFVFLVETGFH 
\MLARDGLKLLTSSDPPASASQSSWDYRREPP 
RLANFFVFLVETGSRYVAQAGVQWLFTGAIP 
I XISTG VLTCS VSDLGRFTPP 


601 


1951 


A 


4533 


1460 


403 


"HEVQESIHFLESEFSRGISDNYTLALITYALSb 
VG SPKAKEALNMLT WRAEQEGGMQFWV SSE 
SKLSDSWQPRSLDEEVAAYALLSHFLQFQTSE 
GIPIMRWLSRQRNSLGGFASTQDTTVALKALS 
EFAALMNTERTMQVTVTGPSSPSPVKFLIDT 
HNRLLLQTAELADGTANGSV/SISANGFGFAI 
CQLNVVYNVKASGSSRRRRSIQNQEAFDLDV 
AVKENKDDLNHVDLNVCTSFSGPGRSGMAL 
MEVNLLSGFMVPSEAJSLSETVKKVEYDHGK 
LNL YLD S VNETQFC VN I P A VRNFK V SNTQD A 
SVSIVDYYEPRRQAVRSYNSEVKLSSCDLCSD 

VQRLPSL 


602 


1952 


A 


4540 


1963 


295 


MRAPGRPALRPLPLPPLLLLLLSSPWGRAVr'C 

V SGGLPKP ANrTFLSINMKN VLQ WTPPEGLQG 

VKVTYTVQ YFI YGQKK WLNKSECRNINRT YC 

DLSAETSDYEHQYYAKVKAIWGTKCSKWAE 

SGRFYPFLETQIGPPEVALTTDEKS1SWLTAP 

EKWKRNPEDLPVSMQQIYSNLKYNVSVLNT 

KSNRTW SQC VTNHTLVLTWXLEPNTL YCVHV 

ESFVPGPPRRAQPSEKQCARTLKDQSSEFKAK 

nFWYVLPISITVFLFSVMGYSIYRYIHVGXKEIC 

HP\ANLILIYG\NEFDKRFFVPA\EKJWNFI\TL 

NIS\DDSK1SHQDMSLLGKSSDVSSLNDPQPSG 

NLRPPQEEEEVKHLGYASHLMEIFCDSEENTN 

EGTSFTQQESLSRTIPPDKTVIEYEYDVRTTDI 

CAGPEEQELSLQEEVSTQGTLLESQAALAVL 

GPQTLQYSYTPQLQDLDPLAQEHTDSEEGPEE 

EPSTTLVDWDPQTGRLCIPSLSSFDQDSEGCE 

PSEGDGLGEEGLLSRLYEEPAPDRPPGENETY 

LMOFMEEWGLYVQMEN 


603 


1953 


A 


4543 


3 


600 


" YSAVEFVEQASGISDWW f NPALRKRML5USUL 
GMIAPYYEDSDL1CDLSHSRVLQSPVSSEDHAI 
LQAVIAGDLMKLIESYKNGGSLLIQGPDHCSL 
LHYAAETGNGEIVKYILDHGPSELLDMADSE 
TGETALHKAACQRNRAVCQLLVDAGASLRK\ 
TDSKGKTPQERAQQAVGDPDLAAATIESRQN 
YKV1 GHEDLET A V 


604 


1954 


A 


4548 


3 


938 


" QbNKVQNGSLHQKDTVHDNDFEPYLlogAN 
QSNSYPSMSDPYLSSYYPPSIGFPYSLNEAPW 
STAGDPPIPYLTTYGQLSNGDHHFMHDAVFG 
QPGGLGNNTYQHRFNFFPENPAFSAWGTSGS 
(X5QQTQSSAYGSSYTYr^SSLGGTVVDGQPG 
FHSDTLSKAPGMNSLEQGMVGLKIGDVSSSA 
VKTVGSVVSSVALTGVl^GNGGTNVNMPVS 
KPTSWAA1ASKPAKPQPKMKTKSGPVMGGG 
LPPPPKHNMDIGTWDNKGPVPKAPVPQQAP 
SPQAAPQPQQYAQPLPAQPPALAQPQYQSPQ 

QPPO 


605 


1955 


A 


4553 


2 


2304 


ILLQEKRNCLLMQLEEATRLTSYLQSQLKSLCJ 

ASTLTVSSGSSRGSLASSRGSLASSRGSLSSVS 

FTDIYGLPQYEKPDAEGSQLLRFDLIPFDSLGR 

DAPFSEPPGPSGFHKQRRSLDTPQSLASLSSRS 

SLSSLSPPSSPLDTPFLPASRDSPLAQLADSCE 

GPGLGALDRLRAHASAMGDEDLPGMAALQP 

HGVPGDGEGPHERGPPPASAPVGGTVTLRED 

SAKRLERRARRISACLSDYSLASDSGVFEPLT 

KRNEDAEEPA YGDT ASN GDPQ1H VGLLRDS G 

SECLLVHVLQLKNPAGLAVKEDCKYH1RVYL 

PPLJ5SGTPNTYCSKALEFQVPLVFNEVFRIPV 

HSS ALTUKSLQL YVCS VTPQLQEELLG1AQIN 

LADYDSLSEMQLRWHSVQVFTS\LNHQGRGR 

LGVQERAPPGTLHTPSPSPA/STDAVTVLLAR 

TTAQLQAVERELAEERAKLEYTEEEVLEMER 

KEEQAEAISERSWQADSVDSGCSNCTQTSPPY 

PEPCCMGIDSILGHPFAAQAGPYSPEKFQPSPL 

KVDKETNTEDLFLEEAASLVKERPSRRARGSP 

FVRSGTIVRSQTFSPGARSQYVCRLYRSDSDS 

STLPRKSPFVRNTLERRTLRYKQSCRSSLAEL 

MARTSLDLELDLQASRTRQRQLNEELCALRE 

LRQRLEDAQLRGQTDLPPWVLRDERLRGLLR 

EAERQTRQTKLDYRHEQAAEKMLKKASKEI 

YQLRGQSHKEPIQVQTFREFOAFFTRPRTNIPPL 

PADTW 


606 


1956 


A 


4555 


3429 


776 


* PGSGPGPAPFLAPVAAPVGGISFHLQIGLSKEP 
VLLLQDSSGDYSLAHVREMACSIVDQKFPEC 
GFYGMYDKILLFRHDPTSENILQLVKAASDIQ 
EGDLIEWLSASATFEDFQIRPHALFVHSYRA 
P AFCDHCGEML WGLVXRQGUCCEGCGLNYH 
KJICAFXIPNNCSGVRPJUU^NVSLTGVSTIRT 
SSAELSTSAPDEPLLQKSPSESFIGREKRSNSQ 
SYIGRPIHLDKILMSKVKVPHTFVIHSYTRPTV 
CQYCKKLLKGLFRQGLQCKDCRFNCHKRCA 
PKVPNNCLGEVTINGDLLSPGAESDWMEEG 
SDDNDSERNSGLMDDMEEAMVQDAEMAMA 
ECQNDSGEMQDPDPDHEDANRT1SPSTSNNIP 
LMRWQ S VKHTKRKS STVMKEG WMVHYTS 
KDTLRKRHYWRLDSKCITLFQNDTGSRYYKE 
IPLSEELSLEPVKTSALIPNGANPHCFEITTANV 
VYYVGENVVNPSSPSPNNSVLTSGVGADVAR 
MWEIAIQHALMPVIPKGSSVGTGTNLHRDISV 
SISVSNCQIQENVDISTVYQIFPDEVLGSGQFG1 
VYGGKHRKTGRDVAJKIIDKLRFPTKQESQLR 
NE V AILQNLHHPG VVNLECMFETPER VF WM 
EKLHGDMLEMILSSEKGRLPEHITKFLITQILV 

> . „,Tt t tt"t^> tta n T ST>7 i/"DZTKT\n I A <s A "DPFPOV 

ALRHLHFK>nVHCULK^r£-iN VLLAaAurrry v 

JCLCDFGF ARI1GEKSFRRS WGTPA YL AP E VL 

RNKGYNRSLDMWSVGVUYVSLSGTFPFNED 

EDIHDQIQNAAFMYPFWWKEISHEAIDLTNN 

LLQVKMRKRYSVDKTLSHPWLQDYQTWLDL 

RELECKJGERYITHESDDLRWEKYAGEQGLQ 

YPTTTI .TNPSASHSDTPETEETEMKALGERVSIL 


607 


1957 


A 


" 4563 


1 


4499 


" SRPWWLRASERPSAPSAMAKRSRGPC3RKULL 
ALVLFCAWGTLAWAQKPGAGCPSRCLCFRT 
TVRCMHLLLEAVPAVAPQTSILDLRFNRIREI 
QPGAFRRLRNLKTLLLNNT^QIKRIPSGAFEDL 
ENLKYLYLYKNEIQSIDRQAFKGLASLEQLYL 
HFNQIETLDPDSFQHLPKi.ERLFLHNNKIIHL 

VPGTFNHLESMKRLRLDSNTLHCDCEILWLA 

DLLKTYAESGNAQAAAJCEYPRR1QGRSVATI 

TPEELNCERPRITSEPQDADVTSGNTVYFTCR 

AEGWKPEIIV^RKt^LSMKTDSRLNLLDD 

GTLM1QNTQETDQGIYQCMAKNVAGEVKTQ 

FVTLRYFGSPARPTFVIQPQNTEVLVGESVTL 

ECSATGHPPPRISWTRGDRTPLPVDPRVNITPS 

GGLY1QNVVQGDSGEYACSATNNIDSVHATA 

FI IV Q ALPQFTVTPQDR WIEGQT VDFQCE AK 

GNPPPVIAV/TKGGSQLSVDRRHLVLSSGTLRI 

SGVALHDQGQYECQAYNIIGSQKVVAHLTVQ 

PRVTPVFAS1PSDTTVEVGANVQLPCSSQGEP 

EPAITWNKIXjVQVTESGKFHISPEGFLTINDV 

GPADAGRYECVARNTIGSASVSMVLSVNVPD 

VSRNGDPFVATS1VEAIATVDRAINSTRTHLF 

DSRPRSPNDLLALFRYPRDPYTVEQARAGEIF 

ERTLQLIQEHVQHGLMVDLNGTSYHYNDLVS 

PQYLNLIANLSGCTAHRRVNNCSDMCFHQKY 

RTHDGTCNNLQHPMWGASLTAFERLLKSVY 

ENGFNTPRGINPHRLYNGHALPMPRLVSTTLI 

GTETVTPDEQFTHMLMQWGQFLDHDLDSTV 

VALSQARFSDGQHCSNVCSNDPPCFSVMIPPN 

DSRARSGARCMFFVRSSPVCGSGMTSLLMNS 

VYPRJEQINQLTS YID ASN VYGSTEHEARSIRD 

LASHRGLLRQGIVQRSGKPLLPFATGPPTECM 

RDENESPEPCFLAGDHRANEQLGLTSMHTLW 

FREHNR1ATELLKLNPHWDGDTIYYETRKIVG 

AEIQHITY QHWLPKJLGEVGMRTLGE YHG YD 

PGINAGIFNAFAT\AAFRFGHTLVNPLLLPGLD 

ENFQPIAQDHLPLHKAFFSPFRIVNEGGIDPLL 

RGLFGVAGKMRVPSQLLNTELTERLFSMAHT 

V ALDL AAIN1QRGRDHGIPPYHD YRVYCNLS 

AAHTFEDLKNEIKNPEIREKLKRLYGSTLNID 

LFPALWEDLVPGSRLGPTLMCLLSTQFKRLR 

DGDRLWYENPGVFSPAQLTQIKQTSLARILCD 

NADNITRVQSDVFRVAEFPHGYGSCDEIPRVD 

LRVWQDCCEDCRTRGQFNAFSYHFRGRRSLE 

FSYQEDKPTKKTRPRK1PSVGRQGEHLSNSTS 

AVFSTRSDASGYTNDFQR VCS WEMQKTITDLR 

TQIKXLESR\LSTTECVDAGGESHANNTKWK 

KDACTICECKJX3QVTCFVEACPPATCAVPVNI 

PGACCPVCLQKRAEEKP 


608 


1958 


A 


4566 


354 


1135 


" FSFLC/GVSGRLGLDSEEDYYTPQKVDVPKAL 
nVAVQCGCDGTFLLTQSGKVLACGLNEFNKL 
GLNQCMSGIINHEAYHEVPYTTSFTLAKQLSF 
YK1RTLAPGKTHTAAIDERGRLLTFGCNKCGQ 
L G VGNYKKRLG INLLGGPLGGKQ VIR VSCGD 
EFTIAATDDNHIFAWGNGGNGRLAMTPTERP 
HGSDICTSWPRPIFGSUfflVPDLSCRGWHTILI 
VEKVLNSKTIRSNSSGLSIGTVFQSSSPGGGGE 

GGPDAW 


609 


1959 


A 


4567 


1 


412 



" FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 
PF S CLSLPS S WD YRRPPLRP ANFFVFL VETGF 
HRFSRDGLDLLT/S/GDPPASASQSAGITGVSH 
RARPRINLRNViYSFAVTYCLNYISLAMSSTL 
KLSFHVLSGS 


610 


1960 


A 


4570 


697 


467 


" ECRGVISAH\CCTLCLPSSSDSASAF\RVARn 
GTCDYAQLIFAFLVEMGFHHVGQDGLHLL/N 
LVIRPPRPPKVLGLQA 


611 


1961 


A 


4571 


25 


1396 


ADPHTTVIRFFPAASATKRVLPPVLRVSSPK1 
WNPNVPESPRIPAPRLPKRMSGAPTAGAALM 
LCAATAVLLSAQGGPVQSKSPRFASWDEMN 
VLAHGLLQLGQG\CANTAGAHPQSAERAGA\R 
LSACGSACQGTEGSTDLPLAPESRVDPEVLHS 
LQTQLKAQNSRIQQLFHKVAQQQRHLEKQHL 
RIQHLQSQFGLLDHKHLDHEVAKPARRKRLP 
EMAQPVDPAHNVSRLHRLPRDCQELFQVGER 
QSGLFEIQPQGSPPFLVNCKMTSDGGWTVIQR 
RHDGSVDFNRPWEAYKAGFGDPHGEFWLGL 
EKVHSITGDRNSRLAVQLRDWDGNAELLQFS 
VHLGGEDTAYSLQLTAPVAGQLGATTVPPSG 
LSVPFSTWDQDHDLRRDKNCAKSLSGGWWF 
' GTCSHSNLNGQYFRSIPQQRQKLKKGIFWKT 
WRGRYYPLQATTMLIQPMAAEAAS 


612 


1962 


A 


4575 


162 


3 


FFFETESRSVAQAGVQWRDLSSLQPPPPGXSR 
GSPASASPVAGITGTRHHRTRG 


613 


1963 


A 


4584 


687 


321 


PLAQRRPFLWVTVKTNGHIWGSSTYPHFWGS 
SNS/PASASQVAGIPNARHQARIIFVFLVEPRF 
HHVGRAGLGFL/NLAICLPQHPKVLGLQACN 
LNIKJPHPAHKYISMIQFKVHFMCMSVHTYI 


614 


1964 


A 


4589 


727 


299 


PGSAQSAQRGRGRRRARAGSATQITMYSFMG 

GGLFCAWVGTrLLWAMATDHWMQYRLSGS 

FAHQGLWRYCLGNKCYLQTDSIAYWNATRA 

FMILSALCAISGIIMGIMAF/GWVAVLMTFFA 

GIFYMCAYRVHECRRLSTPR 


615 


1965 


A 

A 




2 


414 


TILPEK1QAW AQKQCPQSGEEAVAL VVHLER 

ETGRLRQQVSSPVHR£fCHSPLGAAWEVADFQ 

PEQVETQPRAVSREEPGSLHSGHQEQLNRKR 

ERRPLPKNARPSPWWALADEWNTLHQEVTT 

TRLPAGSQEPVKD 


616 


1966 


A 




773 


488 


" DFALVAQAGVQWHNLGSPQPLPPGFKRFSCL 
SLPSSWEYRCVPP/RLANFVFLVEMGFLHVGQ 
AGLELPTSGDPPALASQSAGITGVTTVPSGPG 


617 


1967 


B 


4595 


84 


478 


XRHGLREPLLERRCAAASSFQHSSSLGRELPY 
DPVDTEGFGEGGDMQERFLFPEYTLDPEPQPT 
REKQLQELQQQQEEEERQRQQRREERRQQNL 
RARSREHPWGHPDPALPPSGVNCSGCGAEL 

HCQDAR* 


618 
619 


1968 
1969 


A 
A 


4596 
4601 


2945 
2 


1188 

357 


ARSRNSARGVYGMCVDTLFLCFLEDLERNDG 

SAERPYFMCSTLFCKPLARRCFPAIHAYKGVL 

MVGNETTYEDGHGSRKNITDLVEGAKKANG 

VLEARQLAMRIFEDYTVSWYWinGLVIAMA 

MSLLSIILLHLLAGIMGWVMIIMEIVSELGYRIF 

HCYMEYSRLRGEAGSDVSLVDLGFQTDFRV 

YLHLRQTWLAFMULSILEVinLLLIFLRKRILI 

AIALIKEASRAVGYVMCSLLYPLVTFFLLCLCI 

AYWASTAVFLSTSNEAVYKIFDDSPCPFTAKT 

CNPETFPSSNESRQCPNARCQFAFYGGESGYH 

RALLGLQIFNAFMFFWLANFVLALGQVTLAG 

AFASYYWALRKPDDLPAFPLFSAFGRALRYH 

TGSLAFGALILAJVQIIRVILEYLDQRLKAAEN 

K-FAKCLM 1 CLJvULx Wv^i^DNxirvXi^i 1 <Jvi>/^ 1 i - rvj - 
IAIYGTNFCTSARNAFFLLMRNIIRVAVLDKV 
TDFLFLLGKLLFVGSVGILAFFFFTHRIRrVQDT 
APPLNYYWVPILTVIVGSYLIAHGFFSVYGMC 
VDTLFLCFLEDLERNDGS AERP YFMS STLKKL 

LNKTNKKAAES 
" RTSV^EPYILGEF/RKLSNNTKWKTEYKATEY 
GLAYGHFSYEFSNHRDVWDLQGWVTGNGK 

GLrVXTDPQIHSVDQKVFTTNFGKRGIFYFFN 

NQHVECNE1CHRLSLTRPSMEKPCKS 


620 


1970 


A 


4606 


1 


2415 


MERLWGLFQRAQQLSPRSSQTVYQRVEGFK 

KGHLEEEEEDGEEGAETLAHFCPMELRGPEP 

LGSRPRQPNLIPWAAAGRRAAPYLVLTALL1F 

TGAFLLGYVAFRGSCQACGDSVLVVSEDVN 

YEPDLDFHQGRJtYW SDLQAMFLQFLGEGRL 

EDTIRQT SLRER V AG S AGMAALTQDIRAAL S 

RQKLDHVWTDTHYVGLQFPDPAHPNTLHWV 

DEAGKVGEQLPLEDPDVYCPYSAIGNVTGEL 

VYAHYGRPEDLQDLRARGVDPVGRLLLVRV 

GVTSFAQKVTNAQDFGAQGVLIYPEPADFSQ 

DPPKPSLSSQQAVYGHVHLGTGDPYTPGFPSF 

NQTQFPPVASSGLPSIPAQPISADIASRLLRKL 

KGPVAPQEWQGSLLGSPYHLGPGPRLRLWN 

NHRTSTPINNIFGCIEGRSEPDHYWIGAQRDA 

WGPGAAKSAVGTAILLELVRTFSSMVSNGFR 

PRRSLLFISWDGGDFGSVGSTEWLEGYLSVL 

HLKAVVYVSLDNAVLGDDKFHAKTSPLLTSL 

ffiSVLKQVDSPNHSGQTLYEQVVFTN\PSWD\ 

AEVIRPLPM\DSSAY\SFTAFVGVPAVEFSFME\ 

DDQVAYPFLHTKEDTYENLHKVLQGRLPAVA 

QAVAQLAGQLLIRLSHDRLLPLDFGRYGDW 

LRH1GNLNEFSGDLKARGLTLQWVYSARGDY 

IRAAEKLRQEIYSSEERDERLTRMYNVRIMRV 

EFYFLSQYVSPADSPFRHIFMGRGDHTLGALL 

DHLRLLRSNSSGTPGATSSTGFQ\ESRFRRQL\ 

ALL\TWDACKGAANALSGDVWNIDNNF 


621 


1971 


A 


4610 


793 


334 


ISRVDDFVGSGIANVIIAVAIFSIPAFARLVRG\ 

NTLVLKQQTFIESARSIGASDMTVLLRHILPGT 

GSSIWFFTMRIGTSIISAASLSFLGLGAQPPTP 

EWGAMLNEARADMVIAPHVAVFPALAIFLTV 

LAFNLLGDGLRDALDPKIKG 


622 


1972 


A 


4614 


2 


820 


LVYVMIAIFCIASAMSLYNCLAALIHKJPYGQ 

CTIACRGKNMEVRLIFLSGLC1AVAWWAVF 

RNEDRWAWILQD1LGIAFCLNLIKTLKLPNFK 

SCVILLGLLLLYDVFFVFITPFITKNGESIMVEL 

AAGPFGNNEKNDGNLVEATGQPSAPHEKLPV 

VIRVPKLIYFSVMSVCLMPVSILGFGDHVPGL 

LIAYCRRFDVQTGSSYIYYVSVXTVAYAIGMIL 

TFWLGVLMKKGQPALLYLVPCTLITAyCQFV 

AWETVREMKKFWERVTS 


623 


1973 


A 

A 


4619 


17 


691 


TLVSVVEFVRRADLTREDLAPSSVDSGQAGF 

GGCCESGLPNTMPSAFSVSSFPVSIPAVLTQT 

DWTEPWLMGLATFHALCVLLTCLSSRSYRLQ 

1GHFLCLVILVYCAEYINEAAAMNWRLFSKY 

QYFDSRGMFISIVFSAPLLVNAMIIWMWVW 

KTLNVMTDLK^AQERRKEKXRRRKED'GAA 

AAWSLRPSRPPSAAPSAAVCVAWASFQLTHG 

LKNRCFI 


624 


1974 


A 


4622 


164 


668 


" VSCYTALQSIMNQPESANDPEPLCAVCGQAH 

TPCGHTYCTLCLTNFLVEKDFCPMDRKPLVL 
QHCKXSSILVNKLLNKLLVTCPFREHCTQVL 
QRCDLEKHFQTSQA WGTHL* SQLLGRLRQED 
CLSPGVHHCSEV 


625 


1975 


A 


4625 


474 


473 


CFLSPSPLLPPLLLSSSSSPSFPLPPPPTLLPS'I'LP 
PPLLIPSS*LSP 


626 


1976 


A 


4629 


249 


3 


KLKGNECFC YHCNVCIFLMIKK* GLFLC* I Y M 
LFFET*SHSFTRLECSGT1SAHCSLQLQGSSNSP 

ASASQVAGIAGTHH 


627 


1977 


A 


4635 


1 


301 


" FFFFETKPFFAPQAGGQGPSRGSLNPLPTOLK. 
QFSGLTLSRSGNNGPRPPPRVNFGILRGNGVP 
PGGAG' PRPPDLRGPPGLAPPQGGNNGGDPP 
ARAYL 


628 


1978 


A 


4648 


1357 


782 


KLFS SQRLFGPHIQ AINP SFLLLSFFPS * LL AMR 
TVGNNAPILVFLVYRIVLLLF*HV*PAYFQPS1C 
NKTAKENCN* RPFLFL VC YLL* AELHIGIFI ANF 
YDCIPNKLNEHLWPKLLQSLIFHVDFCGFLHK 
VFY1CFTEFLLFLYFL*LFIIKVSCSII*CSTICVF 
SYKSFAVIIFFVDNTRFFSFGF 


629 


1979 


A 


4660 


18 


999 


HHELHTLELLQNPKEVLTRSEIQDVNYSLbAV 

KVKTVCQIPLMKEMLKRFQVAVNLAEDTAH 

PKLWSQEGRYVKKTASASSWPVFSSAWNYF 

AGWRNPQKTAFVERFQHLSCVLGKNVFTSG 

KHYWEVESRDSLEVAVGVCREDVMGITDRS 

KMSPDVGIWAIYWSAAGYWPLIGFPGTPTQQ 

EPALHRVGVYLDRGTGNVSFYSAVDGVHLH 

TFSCSSVSRLRPFFWLSPLASL\OPPVTDRK*G 

FSSPDQNSFPWQLRDTHPWALFCPSCLYPG 

WSIFWVSLTVPFGICPLCASQEAVPWEVGLA 

NGDGTGNFPRRFWEIFL 


630 


1980 


A 


4669 




358 


"FFFFFETESHSVAQAGMQWRNLGSLPAPPFUh 
TPFFCLSLLNGWDYRRPPPHLANFFVLLVETG 
FHDVGQDGLDLLTS* STPSASQSAEITGVSHC 
TRLKKIRFAKGHVEFFFESHVE 


631 


1981 


A 


4674 


953 


614 


TPIRGTDDEHEECTVQEYSAGKNTCLRPUAV 
AHTCNPCTLGGRGR\VIT*GSGVQDQPGPTWQ 
NPVFLERRPRALHSSPGLTTQRILWAQGLWV 
GAGSTGCSRGPRGEGVFREG 


632 


1982 


A 


4678 


34 


314 


" RSTHASGMISPSFGFMGHLLRLEFEILPb 1 FNF 
*LPSYQGEAAGSSL1SHLQTFSPDLKGVYCTFP 
ASGLAPVPTHWTVSELSRSPVATATFC 


633 


1983 


A 


4696 


1 


1365


RTLGMEGERRASQAPSSGLPAGGANGESPGU
GAPFPGSSGSSALLQAEVLDLDEDEDDLEVFS 
KDASLMDMNSFSPMMPTSPl^MrNQIKFEDEP 
DLKDLFITVDEPESHVTTIETFITYRIITXTSRG 
EFDSSEFEVRRRYQDFLWLKGKLEEAHPTLII 
PPLPEKFIVKGMVERFNDDFIETRRKALHKFL 
NRIADHPTLTFNEDFKIFLTAQAWELSSHKKQ 
GPGLLSRMGQTVRAVASSMRGVKNRPEEFM 
EMNNFIELFSQKINLIDKISQRIYKEEREYFDE 
MKEYGPIHILWSASEEDLVDTLKDVASCIDRC 
CKATEKRMSGLSEALLPVXTOYVLYSEMLM 
GVMKRRDQIQAELDSKVEVLTYKXADTDLL 
PEEIGKLEDKVECANNALKADWERWKQNM 
QNXiIKLAFTDMAEENIHYYEQCLATWESFLT 
S QTNLHLEE ASEDKP 


634 


1984 


A 


4708 


42


158


" S YWVGEDYTYKFFEVlLIDPFHKAlRKNl'U 1 y 
WI5KAVYKHREMCGLTSTGRKSHGLEKT)RM 
FPHAIGGSCRAA*RRRKTLQFPCYH 


635 


1985 


A 


4709 


42 


341 


" YTKQPDAKERRRTVHWKKETESEASEI I IFF* i 
PGVPQAPGHWEDYGRGDNFYLPH*DPGGIVL 
WNIFNRMPIARKNITDGEHHEYLIEVPRLFHT 

SED 


636 


1986 


A 


4721 


2 


351 


EKPDHFFPEGTSFIHEPRRPN ♦ GDL VHCLGG1S 
RSTTVTVA* LMQKLNLSMNDAYYTVIMKMS S 














ISPNFNSMDQPLDFQRTLGLRSPCYNRVPAQK. 
MYFTTPSNHNAYQVDSVQST 


637 


1987 


A 


4726 


664 


253


NTGLTCSIQRKCGETQLYRREENRLILLLQUH
LKSESFQVLTLSPRLEFSGLISAHCNLRLPGSS 
D S S AS S SRAAGITG VHHHA WLIFFFL VETGFL 
HAG*AGLELLTSGDPPASASRSAGITGVSHHA 

RPRETRFL 


638 


1988 


A 


4734 


24 


592 


"GGMDSRVSGTTSNGETKP V Y PVMEKKEEDG 
TLERGHVmNKMEFVLSVAGEUGLGNVWRFP 
YLCYKNGGGAFFIPYLVFLFTCG1PVFLLETAL 
GQYTSQGGVTAWRKICPIFEGIGYASQMIVTL 
LNVYYIIVLAWALFYLFSSFT1DLPWGGCYHE 
WNTEHCMEFQKTNGSLNGTSENATSPVIEFW 


639 


1989 


A 


4743 


1040


699


QGLTLLPRMECSATlTAHCSLELPGSIDLFrSA 
S * V ARTTGTHHHP WL1LVLLL*TWGSYYV AQ 
AGLELLGSSNLPAAMVSQSAQI1GHDHCAWA 
TSNHVLYTQEGLRRGKEG 


640 


1990 


A 


4771 


527 


2 


" G RI DCPHP AT VL AQPEF1D ACSVLG A Y QCi A<^ N 
WIRRMCLPSGCLJCMNREIGPLQHSLCCPGWS 
QTPGLKAILLRQPPK* LGLQMESHSCPPA WS A 
MARSRLTATSASQVQAILLPQPPGTTDSCSPS 
PDHEQQPLSWVLPPPQKDMNPRJEQQVALGP 
OAAALPWAVWRNDCFPR 


641 


1991 


A 


4780 


16 


473 


RPSSQCGGIPTGWKKGLAPELSSELSSPPLFAR 

LQLAASPYFSPSWAECPQPVPAGTHATWCLA 

RVWARMTPPGPAG1PSHPLPPPPPERSVPIPSP 

FPARDSGSRQGHSTDRYKHTDAPRDAHRRVP 

ORDTDTGVHTGSGTHTHAHTPPEK 


642 


1992 


A 


4798 


1 


487 


GYSFRCDIVDYSRSPTALRMARTCWLY Yh'SK. 
FIELLDTIFFVLRKKNSQVTFLHVFHHTIMPW 
TWWFGVKFAAGGLGTFHALLNTAVHVVMY 
SYYGLSALGPAYQKYLWWKKYLTSLQLVQF 
VTVAIHISQFFFMEDCKYQFPVFAC1IMSYSFM 

FT 1 1 ,H 


643 


1993 


A 


4799 


2 


391 


LMAFIEMHlSGSLVYLKiKTKlYSYFb T MLN^l.L 

QEIPLSEILRISSPRDFTNlSQGSNPHCFEirrDT 

MVYFVGENNGDSSHNPVLAATGVGLDVAQS 

WEKA1RQALMPVTPQASVCTSPGQGKDHSK 

O'ASVCTSPGQGKDHSKQ 


644 


1994 


A 




488 


101 


" AYPLFAVHPVHTECVAGWGRAYLLCAl^l-L 
LSFLGYCKAFRESNKEGAHSSTFWVLLSLFLG 
AVAMLCKEQGITVLVRAATWLGPAFSVCPFP 
SYKDIWGWPCLCGVLHAYIPLLV 


645 


1995 


A 


4805 


458 


126 


" LL WTT VLCQTP ARPQ STM1HLGHILFLLLLP V 
AAAQTTPGERSSLPAFYPGTSGSCSGCGSLSL 
PLLAGLVAADAVASLLIVGAVFLCARPRRSP 
AOEDGKVYINMPGRG 


646 
647 


1996 
1997


A 
A 


4817 
4854


47 
1044 


1033 

1 



" LQGDTWHLSFLSHFSRLHGGVPGRGLLhCrNL 
LQPQAPGHDMTSIPFPGDRLLQVDGVILCGLT 
HKQ A VQCLKGPGQ V ARJL VLERR VPRSTQQC 
PSANDSMGDERTAVSLVTALPGRPSSCVSVT 
DGPKF* SSN*KKIANGLGFSFVQMEKESCSHL 
T/cnt vttTKRI FPGHP AEENG AIAAGD11LGRE 
WEGPRKASSSRCRGSWAMQLSVQAGPSFAS 
YYPAAVEVLHLLRGAPQEXTLLLCRPPPGAL 
PELEQEWQTPELSADKEFTRATCTDSCTSPIL 
GSRGQLGGTVPPQMQGKAWGLRPESSQKAIR 
EGTMGAKTERDLGPVP 

PRVRGDWPLEKKKSN SMHPEFSWCGS 1 DblUJ 














1 VMPTYDLTDS V LETMGRV S LDMMS V AJN i 

GPPVV^SKNSTAVWRGRDSRKERLELVKLSRK 

HPELIDAAFmFFFFKHDENLYGPlVKHISFFD 

FFKHKYQINIDGTVAAYRLPYLLVGDSWLK 

QDSIYYEHFYNELQPWKHYIPVKSNLSDLLEK 

LKWAKDHDEEAKKIAKAGQEFARNNLMGD 

DIFCYYFQTFPRNMPIYK 


648 


1998 


A 


4867 


2030 


837 


AGMLPAVGSADEEEDPAEBDCPELVFMh-ri V? 

SEEEEKSGLGAKIPVTIITGYLGAGKTTLLNYI 

LTEQHSKRVAVILNEFGEGSALEKSLAVSQG 

GELYEEWLELRNGCLCCSVKDNGLRAIENLM 

QIOCGKEDYILLETTGLADPGAVASMFWVDA 

ELGSDIYLDGIITIVDSKYGLKHLAEEKPDGLI 

NEATRQVALADAILINKTDLVPEEDVKKLRT 

TIRSINGLGQILETQRSRVDLSNVLDLHAFDSL 

SGISLQKKLQHVPGTQPHLDQSIVTITFDVPG 

NAKEEHLN^IQNLLWEKNVRNKDNHCMEV 

IRLKGLVSIKDKSQQV1VQGVHELYDLEETPV 

SWKDDTERTNRLVLLGRNLDKDILKQLFIAT 

VTETEKOWTTHFKEDQVCT 


649 


1999 


A 


4873 


226 


189


E>G VSLLLPKLGVQ WAQY W AHW QPPLFUMtK 
FSCLSLRSSWD+KCAPPHPAFVFLVEMGFHRV 
GQAGLELRTSGDPPAS ASQSAGITGVSHLA* P 
TSMPLLPFQRLCVYI 


650 


2000 


A 


4874 


2 


437 


FFFLRJISFAFVAQAGVQWCDLGSPQPLPPGF 
K*FSCLSLPSSWDYUHAPPPCPS*FLYF**RQG 
FTMLARLVLNS*PHDLPTSPSQSAEIKGVSHR 
CP ASF YLFLK Y YLE AKFCA* GECAPS AG VG A 
GYKRGHKSCLLrNCWQI 


651 


2001 


A 


4898 


1701 


771 


"DAWGPETRLARILNPDSFJEPRPGRLPELbAi K 
PHMEPKA SCPAAAPLMERKPHVL VGVTG S V 
AALKLPLLVSKLLDIPGLEVAVVTTERAKHFY 
SPQDIPVTLYSDADEWEMWKSRSDPVLHIDL 
RRWADLLLVAPLDANTLGKVASGICDNLLTC 
VMRAWDRSKPLLFCPAMNTAMWEHPITAQQ 
VDQLKAFGYVE1PCVAKKLVCGDEGLGAMA 
E VGTI VDKVKE VLFQHSGFQQS* PGI SVMG VP 
LYSEWVQAKSVKMDVGK1GGYPHLLNGGPA 
LSLPRGOACSRLNWTEGPGLSFFQPGEAAA 


652 


2002 


A 


4927 


1 


611


FRGRQTSRPARGFSPWRPPGTMQEPSSUhUFA
SP*LPCASNRLAFGGLDFPCAPLVPYPAPFSPLL 
PAFSCAPRPRAHTHSRTHPSAPLVPKPSSRAR 
GQSPIPSRASSPSCSWAQVPGVALARCAGVC 
KPGDSWRVAACISGRCCSRGRRRGSGPRNPE 
QSFRGAWGPSFWGSW1CSQRELSAGGAQAWP 
LLGSAGSGLRGEA 


653 


2003 


A 






283 


" FFFFI*DGVSLCHPGWNAVARSWLTA'l SASK 
VQAVSCFRLPSSWDYRHATMPG*FF*YF**R 
WGFTrLAILVLNS*PQVlCPPWPPKVLTLQA 


654 


2004 


A 


4968 


3 


437 


RPGIPGRRFRRSWFCQLP^EPEPGLESLA 1 KbU 

IPAVGLGALGV1PPVRVPQRPPTQRSQGRGW 

DPERDPGCRVQVSRGPRFGEQKTPGLQGCLP 

PPCLTHI^^SCVVVWCGRWKRDSAECQCD 

HSCSAVSOOEDRCRSSSCS 


655 


2005 


A 


4983 


201 


397 


a, MhTN^TTCIQPSMlSSMALP1IYILLCIVUVhUN 
TLSOWIFLTKIGKJCTSTHIYLSHLVTAKLLVC 


656 
657 


2006 
2007 


A 
B 


4988 
5008 


332 
129 


159 
465 


L VHKDMYREFFEEEAQ ASNKHVTRCLTSLV I 
RE VHIKTMR* HF1PIRLEK>JKNNDCD 
""MAGMKTASGDYIDSSWELRVFVGEEDPEAES 














VTLRVTGESHIGGVLLKJVEQINRKQDWSDH 

AIWWEQKRQWLLQTHWTLDKYGILADARLF 

FGPOHRPVILRLPNRRALRLX* 


658 


2008 


A 


5017 


1 


292 


FFFFKETESHSVTQAGVQWHDLGSLQPPPPGF 
KRFSCLSLLSSWDYRCAPPHPANFVFLVETGF 
HHV AQAGLKLLTL* S ANLGLSTSLPIPLFILLS 


659 


2009 


A 


5018 


1


338 


RGHGGKSLTGGTPGNWGDGLLVSEDWSHL1F 
T*NSLVSPVLGKWSPCLQGPGLSAVHTWPWL 
MAACWAVHVKTHMRPGLAVLPRLVLNSWS 
♦ AIILL WPPKALGLQ A 


660 


2010 


A 


5028 


2 


1 1 a 


SRVDDFVGERRGGCDECLCGHRGLRAVPLG 

HPGHLCLQPPGGPA*FLDYCRGCCPHPVPGST 

AGSCPRQKKTTPGPTVLCVCSFWIYQRGEPH 

HRTGARWNH 


661 


2011 


A 


5050 


752 


431 


RQSCSSTQAKVQWFHYGPLQSQPPGLKQSbg 
LSLPNSRDHRHVPPRLAIFSFAETGSPYFAQAS 
LELLGSSHPPTSASQS AR1TGVSHRA WPLK* F 
NLNQYQTLTMN 


662 


2012 


A 


5054 


48 


103 


ELNNGPFQMPLCNGGNLAVTG SW ADRSFLH 

EAASQGRLLALRTLLSQGYNVNAVTLDHVTP 

LHEACLGDHVACARTLLEAGANVNAJTIDGV 

TPLFNACSQGSPSCAELLLEYGAQAQLESCLP 

SPTHEGASKGHHECLDILISWGIDVDQEIPHSG 

TPLYVACMAQQFHCIWNHYAGAGVRKGKY 

WDTPLPGAGHQSTQKLE*LFAMVEIWQ 


663 


2013 


A 


5066 


951 


580 


\TO^S*SFAHCASVYKHHYMDGQTPCLFVSSK 
ADLPEGVAVSGPSPAEFCRKHRLPAPVPFSCA 
GP AEPSTTIFTQLATMAAFPHL VHAELHPSSF 
WLRGLLGVVGAAVAAVLSFSLYRVLVKSQ 


664 


2014 


A 


5071 


550 


1 


LSFIEVLSMEQVNKTVVREFVVLGFSSLARLQ 

QLLFVIFLLLYLFTLGTNAIIIST1VLDRALHTP 

MYFFLAELSCSEICYTFVIVPKMLVDLLSQKK 

TISFLGCA1QMFSFLFFGSSHSFLLAAMGYDR 

YMAICNPLRYSVLMGHGVCMGLMAAAWAC 

GFTVSLVTTSLVFHLPFHSSNQHE 


665 


2015 


A 


5074 


496 




QQYHNTGSAGHHAHCQVGHSPHVHYPSGCXi 

PL* IQRGLPSFNSLEGHSLKDSGHEES VQLDSE 

HDVQRSLYCDTAVNDVLNTSVTSMGSQMPD 

HDQNEGFHCREECRILGHSDRCWMPRNPMPI 

RSKSPEHVRNIIALSIEATAADVEAYDDCGPT 

KRTFATFGKDVSDHPAEERPTLKGKRTVDVT 

ICSPKVNSVIREAGNGCEA1SPVTSPLHLKSSL 

PTKPSVSYEIVDPGITARRC 


666 


2016 


A 


5080 


408 


248 


TKSXSTSS*VYFQSSTKDSHFFLFDFQKTGPPL
VGPKAOLSGLQLQPCLYKRR 


667 


2017 


A 


5081 


129 


247


DLTNSHFFLFDFQKTGPPLGGPKAQFSSLQLQ
PPVY*RR 


668 


2018 


A 


5086 


852 


233 


NDCSNDRWVQIKTAYKYFF*KNGDNYNWVh 

RALPTTFADIENLKYLLFTRDASQPFYLGHTV 

IFGDLEYVTVEGGIVLSRELMKRLNRLLDNSE 

TCADQSVIWKLSEDKQLAICLKYAGVHAENA 

EDYEGRD VFNTKPI AQL 1EE ALSNNPQQ VVEG 

rrcnx/i a TTPTJfTf TPOKVTEVMMYGLYRLRAF 

GHYFNDTLVFLPPVGSEND 


669 
670 


2019 
2020 


A 
A 


5101 
5102 


1 

3 


329 

547


" pGRPTRPPLLTLLAHVSPEPAGPSCDSLAgFCi 
ASGV* VQHDSHPPLLCGSQCLSEPVPGSHGPP 
RGCQHEAAPCPRGPGSDGLHHASAACASLPP 
SPILPVLLPELGPL 

DAWGNRCAVGAAPRLIHLHLCCTPAD^SKKr
SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














DEL*NMNGRVDYLVTEEEINLTRGPSGLGFNI 

VGGTDQQYVSNDSGIYVSRIKENGAAALDGR 

LQEGDKILSVNGQDLKNLLHQDAVDLFRNA 

GYAVSLRVQHRLQVQNGPIGHRGEGDPSGIPI 

FMVLVPVFALTMVAAWAFMRYRQQL 


671 


2021 


A 


DJU-) 


O /Z 


400


RDGREELCLQQEPTLPSRICSSAPLLYFLFICPK
VLLLLLLISLLCLYWKARKI^TLRSNTRKEKA 
LWVDLKEAGGVTTNRMED*EEDECN 


672 


2022 


A 


5148 


72 


314 


IIYFSYNIFLKITELLNDVERLKQALNGLSQL T 
YTS GNPTKRQ SQLIDTLQHQ VKSLEQQLA V S 
NO AHG ALQE YVL APC S 


673 


2023 


A 


5152 


210 


335 


REELCSRIGRLNIV*MSLFPNLTCRLNAIP1KIPA 
NHFVEVT 


674 


2024 


A 


5153 


3 


2953 


LTEDQPFDILQKSLQEANITEQTLAEEAYLDA 

SIGSSQQFAQAQLHPSSSASFTQASNVSNYSG 

QTLQPIGVTHVPVGASFASNTVGVQHGFMQH 

VGISVPSQHLSNSSQISGSGQIQLIGSFGNHPS 

MMTINNLDGSQIILKGSGQQAP SN V SGGLL V 

HRQTPNGNSLFGNSSSSPVAQPVTVPFNSTNF 

QTSLPVHNinQRGLAPNSNKVPINlQPKPIQM 

GQQNTYNVNNLGIQQHHVQQGISFASASSPQ 

GSWGPHMSVNIVNQQNTRKPVTSQAVSSTG 

GSIVIHSPMGQPHAPQSQFLIPTSLSVSSNSVH 

HVQTINGQLLQTQPSQLISGQVASEHVMLNR 

NSSNMLRTNQPYTGPMLNNQNTAVHLVSGQ 

TFAASGSPVIANHASPQLVGGQMPLQQASPT 

VLHLSPGQSSVSQGRPGFATMPSVTSMSGPSR 

FPAVSSASTAHPSLGSAVQSGSSGSNFTGDQL 

TQPNRTPVPVSVSHRLPVSSSKSTSTFSNTPGT 

GTQQQFFCQAQKKCLNQTSPISAPKTTDGLR 

QAQIPGLLSTTLPGQDSGSKVISASLGTAQPQ 

QEKWGSSPGHPAVQVESHSGGQKRPAAKQ 

LTKGAFILQQLQRDQAHTVTPDKSHFRSLSD 

AVQRLLSYHVCQQSMPTEEDLRXVDNEFETV 

ATQLLKRTQAMLNKYRCLLLEDAMRINPPAE 

MVMIDRMFNQEERASLSRDKRLALVDPEGFQ 

ADFCCSFKLDKAAHETQFGRSDQHGSKASSS 

LQPPAKAQGRDRAKTGVTEPMNHDQFHLVP 

NHIWSAEGNISKKTECLGRALKFDKVGLVQ 

YQSTSEEKASRREPLKASQCSPGPEGHRKTSS 

RSDHGTESKXSSILADSHLEMTCNNSFQDKSL 

RNSPKNEVLHTD1MKGSGEPQPDLQLTKSLET 

TFKNILELKKAGRQPQSDPTVSGSVELDFPNF 

SPMASQENCLEKFIPDHSEGWETDSILEAAV 

NSILEC 


675 


2025 


A 


5154 


599 


1880


LKKMEPFSCDTFVALPPATVDNRIIFGKNSDR
LYDEVQEVVYFPAVVHDNLGERLKCTYIEID 
QVPETYAWLSRPAWLWGAEMGANEHGVCI 
GNEAVWGREEVCDEEALLGMDLVRLGLERA 
DTAEKALNVTVDLLEKYGQGGNCTEGRMVF 
S YHN SFLIADRNEA WILET AGK Y W AAEK V QE 
GVRNISNQLSITTKIAREHPDMRNYAKRKGW 
WDGKKEFDFAAA Y i> Y Lu i AJsJViivi 1 sduivi 
GYiaLNKmGNITFETMMEILRDKPSGINME 
GEFLTTASMVFTLPQDSSLPCIHFFTGTPDPER 
SVFKPFIFVPH1SQLLDTSSPTFELEDLVKKKS 
HFKPDRRHPLYQKHQQALEVVNNNEEKAKJ 
MLDNMRKLEKELFREMESILQNKHLDVEKJV 
NLFPQCTKDEIQIYQSNLSVKVSS 


676 


2026 


A 


5155 


2 


306 


FFFLRRSL ALSPRPDCGLQ WKKLGSLQAPP PG 
FTPFSCLSLPSSWD YRRPPPRPANFLYF* *RRG 
FTIXARMVS1S*PHDPPASASQSAG1TGVSHRA 

PPT 


677 


2027 


A 


5167 


97 


740 


FFHSVDLLALEQSKTFYKPDWFDIVtSbV^UU 

KEAVCV1DMSSFTEFE1TSTGDQALEVLQYLF 

SNDLDVPVGH1VHTGMLNEGGGYENDCSIAR 

LVKRSFFMISPTDQQVHCWAWLKKHMPKDS 

NLLLEDVTWKYTALNLIGPRAVDVLSELSYA 

PMTPDHFPSLFCKEMSVGYANGIRVMSMTHT 

GEPGFMLY1PIEYRWGFTMLSTLVSNS 


678 


2028 


A 


5183 


1919 


2018 


" PALCRLRDDMTVCVADFGLSKKIYSGU Y YRg 
GR1AKMPVKWIA1ESLADRVYTSKSDW/AFG 
VTM VVT^IATRGMTP YPG VQNHEMYD YLLHG 
HRLKOPEDCLDELCKI**SPOSP 


679 


2029 


A 


5190 


39 


499 


RESQVKHFKMRKIDLCLSSEGSEVILATCSUb
KHPPENUDGNPETFWTTTGMFPQEFIICFHKH 
VRIERLVIQSYFVQTLKIEKSTSKEPVDFEQWI 
EKDLVHTEGQLQNEEIVAHDGSATYLRFIIVS 
AFDHF AS VHS VS AEGTWSNLS S 


680 


2030


A 


5204 


541 


92


FILAVLKLACGDISLNALALMVATAVLTLAPL
LLICLSYLF1LSAILRVPSAAGRCKAFSTCSAH 
RTVVWFYGTISFMYFKPKAKDPNVDKTVAL 
FYGVVTPSLNPIIYSLRNAEVKAAVLTLLRGG 
LLSRKASHCYCCPLPLSAGIG 


681 


2031 


A 


5207 


10 


247 


VPDNGDVTKLPVCSTLVEETSLTVSEAMEQSI 
KNESPLPGTLAHTCNTSTLGGRGRWIT*GREF 
DTSMANMVKPCLYRK 


682 


2032 


A 


5210 


2 


231 


FFFETCSYSITQAGVQNVFNLSSLKTLPFUt*^ 
SCLSLPSSWDYRCLPPCPANFCIFSRNGVLPC 
WPGWSRTPDLS 


683 


2033 


A 


5218 


85 


402 


" CPSVSGLIKSDLRRHNINIG1TNVDVKAV SIN Lb 
MIILLRSMYRINVKPYFFI*LFFSRVNC* SVHG 
Y ARC YTFLIF*LFL* IP ADSPTDQEPKTVMLSK 
QSESAI 


684 


2034 


A 


5220 


1 


194 


" NLMKEMQNLNSENHKTWEE YKDTK* 1Mb Y t 
YG*ALNV1KMAVLPKLMYRFSATLVKIPQHL 


685 


2035 


A 


5228 


260 


440 



LHSQDGNSDPRKPQGEMSAHAFPVQTCUbbU 
OKKTPO VPINFTELSKC S * S * KIMSGERE 


686 


2036 


A 


5239 


79 


508 


GGEAAARAAKLSSPRPHRVGRRERGVGGM5 
AFSEAALEKKLSELSNSQQSVQTLSLWLHHR 
KJ1SRPIVTVWERELRKAKPNTIKLTFLYLAND 
VIQNSKRKGPEFTKDFAPVIVEAFKHVSSETD 
ESCKKHLGRVLSIWEERS 


687 


2037 


A 


5244 


1 


428 


~ " MAA W AATALKGRG ARN ARVLRULLAU A l A 
NKASHNRTRALQSHSSPEGKEEPEPLSPELEY1 
PRKRGKNPMKAVGLAWA1GFPCGILLFILTKR 
EVDKDRVKQMKARQNMRLSNTGEYESQRFR 
ASSOSAPSPDVGSGVQT 


688 


2038 


A 


5249 


1 


1407 


LQQTEDKSLLNQGSSSEEVAGSSQKMUQPUK

SGDSDLATALHRLSLRRQNYLSEKQFFAEEW 

^nr/T^n a nnvcr.\/QnrvTPTF^I AST CTTOS 
QRKIQ VL AJu'vNiitJ * v 1 r 1 r " 3i ^ AVO 1 1 " 

ETTDLSSASCLRGFMPEKLQIVKPLEGSQTLY 
HWQQLAQPNLGTILDPRPGVTTKGFTQLPGD 
AI YH1 SDLEEDEEEGITFQVQQPLEVEEKLSTS 
KPVTGIFLPPITSAGGPVTVATANPGKCLSCT 
NSTFTFTTCRILHPSDITQVTPSSGFPSLSCGSS 
GSSSSNTAVNSPAI^YRLSIGESITNRRDSTrT 














FSSTMSLAKLLQERGISAK\ r YHSPISENPLQPL 

PKSLAIPSTPPNSPSHSPCPSPLPFEPRVHLSEN 

FLASRPAETFLQEMYGLRPSRNPPDVGQLKM 

NLVDRLKRLGIARVVKNPGAQENGRCQEAE1 

GPQKPDSAVYLNSGSSLLGGLRRNQSLPVIM 

GSFAAPVCTSSPKMGVLKED 


689 


2039 


A 


5254 


2 


2621 


LSLFGSRALGRSGARAMAKAKKVGARRKAS 

GAPAGARGGPAKANSNPFEVKVNRQKFQILG 

RKTRHDVGLPGVSRARALRKRTQTLLKEYKE 

RDKSNVFRDKJIFGEYNSNMSPEEK^IMKRFA 

LEQQRHHEKKSIYNLNEDEELTHYGQSLADIE 

KHNDIVDSDSDAEDRGTLSGELTAAHFGGGG 

GLLHKKTQQEGEEREKPKSRKELIEELIAKSK 

QEKRERQAQREDALELTEKLDQDWKEIQTLL 

SHKTPK SENRDKKEKPKPD AYDMMXTIELGF 

EMKAQPSNRMKTEAELAKEEQEHLRKLEAE 

RLRRMLGKDEDENVKKPKHMSADDLNDGFV 

LDKDDRRLL S YKDGKMN VEED VQEEQ SKEA 

SDPESNEEEGDS SGGEDTEESDSPDSHLDLES 

NVESEEENEKPAKEQRQTPGKGLISGKERAG 

KATRDELPYTFAAPESYEELRSLLLGRSMEEQ 

LLVVERIQKCNHPSLAEGNKAKLEKLFGFLLE 

YVGDL ATD DPPDLTV IDKL WHL YFfl^CQMFP 

ESASDAIKFVLRDAMHEMEEMIETKGRAALP 

GLDVLIYLKITGLLFPTSDFWHPWTPALVCL 

SQLLTKCPILSLQDWKGLFVCCLFLEYVALS 

QRFIPELINFLLGILY1ATPNKASQGSTLVHPFR 

ALGKNSELLVVSAREDVATWQQSSLSLRWA 

SRLRAPTSTEANHIRLSCLAVGLALLKRCVLM 

YGSLPSFHAIMGPLRALLTDHLADCSHPQELQ 

ELCQSTLTEMESQKQLCRPLTCEKSKPVPLKL 

FTPRLVKVLEFGRKQGSSICEEQERKRLIHKHK 

REFKGAVREIRKDNQFLARMQLSE1MERDAE 

RXRKVKQLFNSLATQEGEWKALKRJCKFKK 


690 


2040 


A 


5261 


1 


304 


FFFFVFLVETGFHHVGQAGLELLTSGDPPTW 
ASQSAGITGVSHCSWPVIYVLSTLLHAVRNVL 
FKRTFPLKSSSFLSYDKE1FPILIVLKFYLVTLT 
SFVK 


691 


2041 


A 


5270 


3 


158 


NCHTTHCTANWVHLPGTPPGWKIDGPAAAL 
EVLS SFFFFFLKFSYKPQNIV 


692 


2042 


A 


5282 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEW 

ERVLTFLPAKALLRVACVCRLWRECVRRVLR 

THRSVTWISAGLAEAGHLEGHCLVRWAEEL 

ENVR1LPHTVLYMADSETFISLEECRGHKRAR 

KRTSMETALALEKLFPKQCQVLGIVTPGIWT 

PMGSGSNRPQEIEIGESGFALLFPQffiGKIQPF 

HFIKDPKNLTLERHQLTEVGLLDNPELRWLV 

FGYNCCKVGASNYLQQWSTFSDMNnLAGG 

QVDNLSSLTSEKNPLDIDASGVVGLSFSGHRI 

QSATVLLNEDVSDEKTAEAAMQRLKAANIPE 

IINTIGFMFACVGRGFQYYRAKGNVEADAFR 

KFFPSVPLFGFFGNGEIGCDRTVTGNFILRKCN 

EVKDDDLFHSY 1 llMALIHLUbbis. 


693 


2043 


A 


5301 


362 


507


EEIKERFGPGLVTYWYGFIQELDCNRERG1LLK
ACFPTNIVTLCHSIA 


694 


2044 


A 


5310 


1 1 



204 


RVLTAINHTLKENLRKFYKGKKDKPLDLRPK 
KTRAMRRRLNMHEENLKTKKQHRKERLYPL 
RKYAAJCA 


695 


2045 


A 


5315 


125 


1596


ETRSTAVKSEVQVCISLLLCLEDRTMPKKAKP














TGSGKEEGPAPCKQMKLEAAGGPSALNh DSF 

SSLFESLISPHCTETFFKEFWEQKPLLIQRDDPA 

LATYYGSLFKLTDLKSLCSRGMYYGRDVNV 

CRCVN GKJCK VLNKDGKAHFLQLRKDFDQKR 

AT1QFHQPQRFKDELWRIQEKLECYFGSLVGS 

NVYITPAGSQGLPPHYDDVEVFILQLEGEKH 

WRL YHFFVPL ARE Y SVEAEHRJ GRP VHEFML 

KPGDLLYFPRGTIHQADTPAGLAHSTHVTIST 

YQNNSWGDFLLDTISGLVFDTAKEDVELRTG 

IPRQLLLQVESTTVATRRLSGFLRTLADRLEG 

TKELLSSDMKKDF1MHRLPPYSAGDGAELSTP 

GGKLPRLDSVVRLQFKDH1VLTVLPDQDQSD 

ETQEKMVYIYHSLKNSRETHMMGNEEETEFH 

GLRFPLSHLDALKQIWNSPAISVKDLKLTTDE 

EKESLVLSLWTECLIQW 


696 


2046 


A 


5318 


1476 


742 


LMKXYLEAAELGEISDIHTKLLRLSSSQGTIET 

SLQDIDSRLSPGGSLADAWAHQEGTHPKDRN 

VEKLQVLLNCMTEIYYQFKKDKAERRLAYN 

EEQIHKPDKQKLYYHATKAMTHFTDECVKK 

YEAFLNKSEEWIRKMLHLRKQLLSLTNQCFDI 

EEE VSKYQE YTNELQETLPQKMFTAS SGHCHT 

MTPIYPSSNTLVEMTLGMIOCLKEEMEGVVKE 

LAENNHILESGGSLTMDGGLRNVDCL 


697 


2047 


A 


5320 


244




LDYNFFLFEMTFGLVSQAGVQWHDLGSLQPP
PPGFKQFSCLSLPSSWDYRHLPPHLANFSREG 
VSPSWPGWSRTPDFR 


698 


2048 


A 


5324 


266 


714 


LPIRKSLRSVRSGFPTSQSPITRNLDGTASUSe: 
LAKTVTGSLFRINVGLRGLVAGGIIGALLGTP 
VGGLLMAFQKYSGETVQERKQKDRKALHEL 
KLEEWKGRLQVTEHLPEKIESSLQEDEPENDA 
K KIEALLNLPRNPS VIDKQDKD 


699 


2049 


A 


5334 


699 


277 


RPHGHLVCISSSAGLSGVNGLADYCASKr AA 

FGFAESVFVETFVQKQKGIKTTIVCPFFIKTGM 

FEGCTTGCPSLLPILEPKYAVEKIVEAILQEKM 

YLYMPKLLYFMMFLKSFLPLKTGLLIADYLGI 

LHAMDGFADQKK 


700 


2050 


A 


5344 


3 


614 


PTAEEMSSLTTESSPELAKRSWFGNFISLDKJifc 
QIFLVLKDKPLSSIKADIVHAFLSIPSLSHSVLS 
QTSFRAEYKASGGPSVFQKPVRFQVDISSSEG 
PEPSPRRDGSGGGGIYSVTFTLISGPSRRFKRV 
VETIQAQLLSTHDQPSVQALADEKNGAQTRP 
AGAPPRSLQPPPGRPDPELSSSPRRGPPKDKK 

LLATNGTPL 


701 

702 


2051 
2052 


A 
A 


5346 
5356 


3 

2502 


1383 
1540


HASVXFCRVMAASKTQGAVARMQEDRDCiSC
STVGGVGYGDSKDCILEPLSLPESPGGTTTLE 
GSPSVPCIFCEEHFPVAEQDKLLKHMI1EHKIV 
LADVXLVADFQRYILYWRKRFTEQPITDFCSV 
DUNSTAPFEEQENYFLLCDVLPEDRILREELQ 
KQRLREILEQQQQERNDTNFHGVCMFCNEEF 
LGNRSV1LNHMAREHAFNIGLPDNIVNCNEFL 
CTLQKJaDNLQCLYCEKTFRDICNTLKDHMR 
KKQHRKINPKNREYDRFYVFNYLELGKSWEE 
\ir\i rrr^r^Dtn i nunpnnw^nWEFHPASAVCL 
FCEKQAETIEKLYVHMEDAHEFDLLKIKSELG 
LNFYQQVKLVNFIRRQVHQCRCYGCHVKFKS 
KADLRTHMEETKHTSLLPDRKTWDQLEYYFP 
TYENDTLLWTLSDSESDLTAQEQNENVPIISE 
DTSKLYALK.QSSELNQLLL 

" MAAATRGCRPWGSLLULLGLVSAAAAAWU 














LASLRCTLGAFCECDFRPDLPGLECDLAgHi. 

AGQH1AKALVVKALJCAFVRDPAPTKPLVLSL 

HG WTGTGKS YVS SLLAHYLFQGGLRSPRVH 

HFSPVLHFPHPSHIERYKKDLKSWVQGNLTA 

CGRSLFUTDEMDKMPPGLMEVLRPFLGSSWV 

VYGTN YRKAIFIF1SNTGGEQINQ V ALEA WRS 

RRDREEILLQELEPV1SRAVLDNPHHGFSNSGI 

MEERLLDAWPFLPLQRHHVRHCVLNELAQL 

GLEPRDEWQAVLDSTTFFPEDEQLFSSNGCK 

TVASR1AFFL 


703 


2053 


A 


5380 


278 


657 


LFLQKiiOvlKTEEEARmmiEMFLRKbggKX 
EERLEFWMEKYDKDTEMKQNELNALKATKA 
SDLAHLQDLAKMIREYEQVIIEDRIEKERSKX 
KVKQDLLELKSVTKLQAWWRGTMIRREIGGF 

KM 


704 


2054 


A 


5381 


1 


1003 


FRGRAVKMAAVVEVE VGGG AAGERELU t v 

DMSDLSPEEQWRVEHARMHAKHRGHEAMH 

AEMVLILIATLWAQLLLVQWKQRHPRSYN 

vmTi FOMWVWLYFTVKLHWWRFLVIWILF 

SAVTAFVTFRATRKPLVQTTPRLVYKWFLLIY 

K1SYATGIVGYMAVMFTLFGLNLLFKIKPEDA 

MDFGISLLFYGLYYGVLERDFAEMCADYMA 

STIGFYSESGMPTKHLSDSVCAVCGQQIFVDV 

SEEGIIENTYRLSCNHVFHEFCIRGWCIVGKK 

QTCPYCKEKVDLKRMFSNPWERPHVMYGQL 

LDWLRYLVAWQPVIIGWOGINYILGLE 


705 


2055 


A 


5396 


3 


675 


TVT)R DPI .OLATRAGOPLDINMAGEPKP YRPK^ 

GNKRPLSALYRLESKEPFLSVGGYVFDYDYY 

R DDF YNRLFD YHGRVPPPPRA VIPLFCRPRV A 

VTTTRRGKGVFSMKGGSRSTASGSTGSKLKS 

DELQTIKKELTQIKTKIDSVLGRLDKJEKQQK 

AEAEAQKKLLEESLVLIQEECVSEIADHSTEEP 

AEGGPDADGEEMTDGIEEAFDEDGGHELFLQ 

DC 


706 


2056 


A 


5410 


2 




- GRVGLNLHGRGCSEPKWRHCTPTWA ifc^^i 
S 


707 


2057 


A 


5415 


6 


287 


PFKLTPSFLSHAFSSGQERKVFIELNHIKKCNl
VRGVFVLEEFGNYTILLLGLDSHGSNSNLGAP 
EEGLGAGRKRTSVEKSGGAGVTRKKRDP 


708 


2058 


A 


5423 


3 


291 


SSSNPLGSPSTLWKLCSFVLHNKSCCCSWUS 

TPTUIAITLTVRVCGFIPEVSKTTNPLGRTNNS 

GCTIHCTVTLTARSTASLLKSVRPRTHQKE 


709 


2059 


A 


5424 


679 


347 


" RIRHEEKRGSRGRGRRTSEED 1 PKKKKHKGG 
SEFTOTILSVHPSDVLDMPVDPNEPTYCLCHQ 
VSYGEM1GCDNPDCPIEWFHFACVDLTTKPK 
GKWFCPRCVQEKRKKK 


710 


2060 



A 


5442 


1073 


559 


" QESLKXKlQPKLSLTLSSSVSRGNVSTPPRHbb 
GSLTPPVTPPITPSSSFRSSTPTGSEYDEEEVDY 
EESDSDESWTTESA1SSEA1LSSMCMNGGEEK 
PFACPVPGCKKRYKNVNGDCYHAKNGHRTQI 
RVRKPFKCRCGKSYKTAQGLRHHTINFHPPV 

SAEIIRKMQQ 
— rnci ^rpnvMWRFFT? VTT FI JCMASGHA1*QP 


711 


2061 


A 


5449 


1 


319 


DL VKRIRDAJRMGLS ARHVPSLILETKG IPYTL 
NGKKVEVAVKQIIAGKAVEQGGAFSNPETLD 

LYRDIPELQGF 


712 


2062 



A 


5499 


91 


749


RPTPGHGDFWMQPLTKDAGMSLSSVILAbAL
QVRGEALSEEEIWSLLFLAAEQLLEDLRNDSS 
DYWCPWSALLSAAGSLSFQGRVSHIEAAPF 














KAPELLQGQSEDEQPD ASQMHVYSLGM 1 L Y 
WSAGFHVPPHQPLQLCEPLHSILLTMCEDQPH 
RRCTLQSVLEACRVHEKEVSVYPAPAGLHIR 
RLVGLVLGTISEVSREPCFSSSSCWSCVAIKI 


713 


2063 


A 


5506 


22 


4 Jo 


VEELILVSRLDPHLHTPMYFFLAHLSFLlJLb^ 1 
TSSrPQLLYNLNGCDKTISYMGCAIQLFLFLGL 
GGVECLLLAVMAYDRCVAICKPLHYMVIMN 
PRLCRGLVSVTWGCGVANSLAMSPVTLRLPR 
CGHHEVDHFLCEMPALIRMACISTV 


714 


2064 


A 


5514 


25 


220 


AIRPYWCENNIIGIGKLSTADGKAFAJJFtVi-K 
RLTSSVSCALDEAAAALTRMRAESTANAGQS 


715 


2065 


A 


5526 


3 


810


KVTAPRRPQRYSSGHGSDNSSVLSGELrTAM
rjRTAT FHHSOGSSGYESLRRDSEATGSASSAP 
DSMSESGAASPGARTRSLKSPKKRATGLQRR 
RLIPAPLPDTTALGRKPSLPGQWVDLPPPLAG 
SLKEPFE1KVYEIDDVERLQRPRPTPREAPTQG 
i~ APVSTRLRLAERROORLREVQAKHKHLCEE 
LAETQGRLMLEPGRWLEQFEVDPELEPESAE 
YLAALERATAALEQCVNXCKArWMMVTCFD 
ISVAASAAIPGPQEVDV 


716 


2066 


A 


5529 


458 


790 


SPGYGENKFTVTSXNI A VPLCEMNKJ Y S Y Y 
<WSFRTMDLVLEMCNTNSIHWCGISGRQLG 
KLHPSSSLCLALTLLSSVQGLQSISGLRLTDTF 
LKRTYEYDDIAQVCV 


717 


2067 


A 


5531 


3 


460 


WSFm LKYFNPESWOEDLDNMYLDTPRYKCi 
RS YHDRKS K. VDLDRLNDD AKRYSCTPRNYS 
VNIREELKLANWFFPRCLLVQRCGGNCGCG 
TVNWRSCTCNSGKTVKKYHEVLQFEPGHIKR 
RGRAKTMALVDIQLDHHERCDCICSSRPPR 


718 


2068 


A 


5586 


311 


88 


^VLKNMAPMTALGLLDLHlLNLILFLSACjtUh 
TSWSEIMMmLVFLTLWLLIEMIYCYRKVS 

KAEEAAQENA 


719 


2069 


A 


5598 


1 


330 


" KNCANEAWQKILDRVLSRYDVRLRPNh ubM 
LATNSTRGLNEDELMAHGQEKDSSSESEDSC 
PPSPGCSFTEGFSFDLLNPDYVPKVDKWSRFL 
FPLAFGLFNIVAAERC 


720 


2070 


A 


5628 


798 


148 


" L ppAQIPEAWLLLANVVN^VLILVPLKDRl.UJr 
LLLRCKLLPSALQKMALGMFFGFTSVIVAGV 
LEMERLHY1HHNETVSQQIGEVLYNAAPLSIW 
WQ1PQYLLIGISEIFASIPGLEFAYSEAPRSMQG 
AIMGIFFCLSGVGSLLGSSLVALLSLPGGWLH 
CPKDFGNINNCRMDLYFFLLAGIQAVTALLF 
VWIAGRYERASQGPASHSRFSRDRG 


721 


2071 


A 


5632 


146 


536


MSALIVRKLRSAELTLFSELPTVLGANVNAA
KLHETALHHAAKVKNVDLlEMLDEFGGNrYA 
RDNRGKKPSDYTWSSSAPAKCFEYYEKTPLT 
LSQLCRVNLRKATGVRGLEKIAKLNIPPRLCD 
YT ^vw 


722 


2072 


A 


5638 


3 


3806


CPSLDIRSEVAELRQLENCSWEGHLQILLM^
TATGEDFRGLSFPRLTQVTDYLLLFRVYGLES 
LRDLFPNLAV1RGTRLFLGYALVIFEMPHLRD 

VALPALGAVLRGAVR VfcKJVC^fci-om-a x u-> w 

GLLQPAPGANHIVGNKLGEECADVCPGVLGA 

AGEPCAKTTFSGHTDYRCWTSSHCQRVCPCP 

HGMACTARGECCHTECLGGCSQPEDPRACV 

ACRHLYFQGACLWACPPGTYQYESWRCVTA 

ERCASLHSVPGRASTFGIHQGSCLAQCPSGFT 

RN SSSIFCHKCEGLCPKECKV GTKT1DS1Q AA 














ODLVGCTHVEGSLlLNLRQGYNLEPQLgHSL 

GLVETITGFLK1KHSFALVSLGFFKNLKLIRGD 

AMVDGNYTLYVLDNQNLQQLGSWVAAGLTI 

PVGKIYFAFNPRLCLEHIYRLEEVTGTRGRQN 

KAEINPRTNGDRAACQTRTLRF V SNVTEADRI 

LLRWERYEPLEARDLLSF1VYYKESPFQNATE 

HVGPDACGTQSWNLLDVELPLSRTQEPGVTL 

ASLKP WTQ YA VF VRAITLTTEEDSPHQG AQS 

PIVYLRTLPAAPTVPQDVISTSNSSSHLLVRW 

KPPTQRNGNLTYYLVLWQRLAEDGDLYLND 

YCHRGLRLPTSNNDPRFDGEDGDPEAEMESD 

CCPCQHPPPGQVLPPLEAQEASFQKKFENFLH 

NATTIPISPWKVTSINKSPQRDSGRHRRAAGPL 

RLGGNSSDFEIQEDKVPRERAVLSGLRHFTEY 

RIDIHACNHAAHTVGCSAATFVFARTMPHRE 

ADGIPGKVAWEASSKNSVLLRWLEPPDPNGL 

ILKYEIKYRRLGEEATVLCVSRLRYAKFGGV 

HLALLPPGNYSARVRATSLAGNGSWTDSVAF 

YILGPEEEDAGGLHVLLTATPVGLTLLIVLAA 

LGFFTGKJCRNRTLYASVNPEYFSASDMYVPD 

EV/E VPREQISIIRELGQG SFGMVYEGLARGLE 

AGEESTPVALKTVNELASPRECIEFLKEASVM 

KAFKCHHVVRLLGWSQGQPTLVIMELMTR 

GDLICSHLRSLRPEAENNPGLPQPALGEM1QM 

AGEIADGMAYLAANKFVHRDLAARNCMVSQ 

DFTVKJ GDFGMTRD V YETD YYRKGGKGLLP 

VRWMAPESLKDG1FTTHSDVWSFGVVLWEIV 

TLAEQPYQGLSNEQVLKFVMDGGVLEELEGC 

PLQLQELMSRCWQPNPRLRPSFTHILDSIQEEL 

RPSFRLLSFYYSPECRGARGSLPTTDAEPDSSP 

TPRDCSPQNGGPGH 


723 


2073 


A 


5672 


1 


216 


LAWIDNILPEKJEKXEI'DKKRKRKKGAH^UUU 
EEPQFPPPSVTKIPMESVQSDPQNGIHC1ARKR 

SSSWSYSL 


724 


2074 


A 


5704 


4235 


940 


ARGRRSRPVWAASWGGRGRPAARRKFK.ULA 

ATMGFELDRFDGDVDPDLKCALCHKVLEDP 

LTTPCGHVFCAGCVLPWWQEGSCPARCRGR 

LSAKELNHVLPLKRLILKLDIKCAYATRGCGR 

WKLQQLPEHLERCDFAPARCRHAGCGQVLL 

RRDVEAHMRDACDARPVGRCQEGCGLPLTH 

GEQRAGGHCCARALRAHNGALQARLGALHK 

ALKKEALRAGKREKSLVAQLAAAQLELQMT 

ALRYQKKFTEYSARLDSLSRCVAAPPGGKGE 

ETKSLTLVLHRDSGSLGFNIIGGRPSVDNHDG 

SSSEGIFVSKIVDSGPAAKEGGLQIHDRIIEVN 

GRDLSRATHDQAVEAFKTAKEPIVVQVLRRT 

PRTKMFTPPSESQLVDTGTQTDJTFEHIMALT 

KMS SPSPPVLDP YLLPEEHPS AHE YYDPNDYI 

GDIHQEMDREELELEEVDLYRMNSQDKLGLT 

VCYRTDDEDDIGIYISEIDPNSIAAKDGRIREG 

DRIIQINGIEVQNREEAVALLTSEENKNFSLLI 

ARAELQLDEG WMDDDRNDFLDDLHMDMLE 

EOHHQAMQFTASVLQQKJCHDEDGGTTDTAT 

TT QNOHEKDSGVGRTDESTRNDESSEQENNG 

DDATASSNPLAGQRKLTCSQDTLGSGDLPFS 

NESFISADCTDADYLGIPVDECERFRELLELK 

CQVKSATPYGLYYPSGPLDAGKSDPESVDKE 

LELLNEELRSIELECLSIVRAHKMQQLKEQYR 

ESWMLHNSGFRNYNTSIDVRRHELSDITELPE 

KSDKDSSSAYNTGESCRSTPLTLEISPDNSLRR 








i 






AAEGISCPSSEGAVGTTEA YGPASKNLLSl I b 

DPEVGTPTYSPSLKELDPNQPLESKERRASDG 

SRSPTPSQKLGSAYLPSYHHSPYKHAHIPAHA 

OHYQSYMQLIQQKSAVEYAQSQMSLVSMCK 

DLS SPTP SEPRMEWKVKIRSDGTRYITKRP VR 

DRLLRERALKIREERSGMTTDDDAVSEMKM 

GRYWSK£ERKQHLVKAK£QRRRREFMMQSR 

LDCLKEQQAADDRXEMNILELSHKKMMKKR 

NKKIFDNWMTIQELLTHGTKSPDGTRVYNSF 

T SVTTV 


725 


2075 


A 


5707 


3 


1770 


" OISTEVSEAPVANDKPKTLVVKVQKKAADLP 
DRDTWKGRFDFLMSCVGYA1GLGNVWRFPY 
LCGKNGGGAFLIPYFLTLIFAGVPLFLLECSLG 
OYTSIGGLGVWKLAPMFKGVGLAAAVLSFW 
LN1YY1V1ISWAIYYLYNSFTTTLPWKQCDNP 
VraTDRCFSOTSMVNTTNNTTSAVVEFWERN 
MHQMTDGLDKPGQIRWPLAITLAIAWILVYF 
CIWKGVGWTGKWYFSATYPY1ML11LFFRGV 
TLPGAKEGILFY1TPNFRKLSDSEVWLDAATQ 
IFFSYGLGLGSLIALGSY>JSFHNNVYRDSIIVC 
CINSCTSMFAGFVTFSWGFMAHVTKRS1ADV 
AASGPGLAFLAYPEAVTQLPISPLWAILFFSM 
LLMLGIDSQFCTVEGFITALVDEYPRIXRNRR 
ELF1AAVCUSYLIGLSNITQGG1YVFKXFDYYS 
ASGMSLLFLVFFEC V SI S WF YG VNRF YDN IQE 
MVGSRPCIWWKLCWSFFTPUVAGVFIFSAVQ 
MTPLTMGNYVFPKWGQGVGWLMALSSMVL 
IPGYMAYMFLTLKGSLKQR1QVMVQPSEDIV 
RPFNGPEOPOAGSSTSKEAYI 


726 


2076 


A 


5711 


156 


423 


PRRDPGRTPELRGSAPRKTGANMPVRRCiHVA 

PQNTFLGTICRKFEGQNKKF11ANARVQNCAU 

YCNDGFCEMTGFSRPDVMOKPCTCD 


727 


2077 


A 


5716 


3 


274 


HASE YFFKLC SFQ VFLSFPL ATIV1D V OL V V if 
LVKJSPNVHYVYVLLLVLSGLLFYIPLTHFKIRL 
AWFEKMTCYLQLLFNICLPDVSEE 


728 


2078 


A 


5737 


1899 


649 


lOASRASPYPRVKVDFALSCHEDLLAPlSbPlli 

WKYHSPEEEISLGPACWLWDFLRRSQQAGFL 

LPLSGGVDSAATACLIYSMCCQVCEAVRSGN 

EEVLADVRTTVNQISYTPQDPRDLCGRILTTC 

YMASKNSSQETCTRARELAQQIGSHHISLNID 

PAVKAVMGIFSLVTGKSPLFAAHGGSSRENL 

ALQNVQARIRMVLAYLFAQLSLWSRGVHGG 

LLVLGSANVDESLLGYLTKYDCSSADINPIGG 

ISKTDLRAFVQFCIQRFQLPALQSILLAPATAE 

LEPLADGQVSQTDEEDMGMTYAELSVYGKL 

RKVAKMGPYSMFCKXLGMWRHICTPRQVAD 

KXOCRFFSKYSMNRHKMTTLTPAYHAENYSPE 

DNRFDLRPFLYNTSWPWQFRCIENQVLQLER 

AEPQSLDGVD 


729 


2079 


A 


5741 


1 


5976 


" 'pGCAARLSRARAPGPGAAGAGRKRLAUPOPP 
PASRRLRAPGSRPRLAPCTRRAAQPAHARMA 
PRAAGGAPLSARAAAASPPPFQTPPRCPVPLL 
t t t t t r.AAR AHA! PIORRFPSPTPTNNFALDG 
AAGTVYLAAVNRLYQLSGANLSLEAEAAVG 
PWDSPLCHAPQLPQASCEHPRRLTDNYNKJL 
QLDPGQGLVWCGSIYQGFCQLRRRGNISAV 
AVRFPPAAPPAEPVTVFPSMLNVAANHPNAS 
TVGL VLPP AAG AGG SRLLVG AT YTGYGSSFF 
PRNRSLEDFDU^ENTPEIAIRSLDTRGDLAKLFT 

FDLNPSDDNILKIKQGAKEQHKLGFVS.AFLHP 

SDPPPGAQSYAYLALNSEARAGDKESQARSL 

LAR1CLPHGAGGDAKKLTESYIQLGLQCAGG 

AGRGDLYSRLVSVFPARERLFAVFERPQGSPA 

ARAAPAALCAFRFADVRAAIRAARTACFVEP 

APDWAVLDSWQGTGPACERKLNIQLQPEQ 

LDCG AAHLQHPLSILQPLKATPVFRAPGLTS V 

A V A S VNN YT A VFLGTVNGRLLKrNLNES MQ 

WSRRWTVAYGEPVHHVMQFDPADSGYLY 

LMTSHQMARVKVAACNVHSTCGDCVGAAD 

AYCGWCALETRCTLQQDCTNSSQQHFWTSA 

SEGPSRCPAMTVLPSEIDVRQEYPGMILQISGS 

LPSLSGMEMACDYGNN1RTVARVPGPAFGHQ 

IAYCNLLPRDQFPPFPPNQDHVTVEMSVRVN 

GRNIVKANFTIYDC SRTAQVYPHTACTSCLS A 

QWPCFWCSQQHSCVSNQSRCEASPNPTSPQD 

CPRTLLSPLAPVPTGGSQN1LVPLANTAFFQG 

AALECSFGLEEIFEAVWVNESVVRCDQVVLH 

TTRKSQVFPLSLQLKGRPARFLDSPEPMTVM 

VYNCAMGSPDCSQCLGREDLGHLCMWSDGC 

RLRGPLQPMAGTCPAPEIRAIEPLSGPLDGGT 

LLTIRGRNLGRRLSDVAHGVWIGGVACEPLP 

DRYTVSEEIVCVTGPAPGPLSGWTVNASKE 

GKSRDRFSYVLPLVHSLEPTMGPKAGGTRITI 

HGNDLHVGSELQVXVNDTDPCTELMRTDTSI 

ACTMPEGALPAPVPVCVRFERRGCVHGNLTF 

WYMQNPVITAISPRRSPVSGGRTITVAGERFH 

MVQNVSMAVHfflGREPTLCKVLNSTLITCPSP 

GALSNASAPVDFFINGRAYADEVAVAEELLD 

PEEAQRGSRFRLDYLPNPQFSTAKREKW1KH 

HPGEPLTLVIHVSTKGAGKEQDSLGLQSHEY 

RVKJGQVSCDIQIVSDRUHCSVNE SLGAA VGQ 

LPITIQVGNFNQTIATLQLGGSETAIIVSIV1CSV 

LLLLSWALFVFCTKSRRAERYWQKTLLQME 

EMESQIREEIRKGFAELQTDMTDLTKELNRSQ 

GIPFLEYKHFVTRTFFPKCSSLYEERYVLPSQT 

LNSQG S S Q AQETHPLLGE WKIPESCRPNMEE 

GISLFSSLLDNKHFLIVFVHALEQQKDFAVRD 

RCSLASLLT1ALHGKLEYYTSIMKELLVDLID 

ASAAKNPKLMLRRTESWEKMLTNWMSICM 

YSCLRETVGEPFFLLLCAKQQINKGSIDAITG 

ICARYTLNEEWLLRENEAKPRNLNVSFQGCG 

MDSLSVRAMDTDTLTQVKEKILEAFCKNVPY 

SQWPRAEDVDLEWFASSTQSYILRDLDDTSV 

VEDGRKKLNTLAHYKIPEGASLAMSLIDKKD 

NTLGRVKDLDTEKYFHLYLPTDELAEPKKSH 

RQSHRKKVLPEIYLTRLLSTKGTLQKFLDDLF 

KAILS IREDKPPLAVKYFFDFLEEQAEKRGISD 

PDTLHIWKTNSLPLRFWVNILKNPQFVFDLDK 

TDHIDACLSVIAQAFIDACSISDLQLGKDSPTN 

KLLYAKEIPEYRKIVQRYYKQIQDMTPLSEQE 

MNAHLAEESRKYQNEFNTNVAMAEIYKYAK 

RYRPQIMAALEANPTARRTQLQHKFEQVVAL 

MEDNTYECYSEA 


730 


2080 


A 


5744 


3 


292 


4 QPSPLFHSHLETLQLLRTAQLPEQVSWPWGQ 
V ANGKGNQRNMGSPQPSLLAFERNLELQrMG 
LGYSLLMGKLRPRVAKDTLRVHRDSTPSPLT 
LKD 


731 


2081 


A 


5747 


1 


382 


" FLKCMRKAFRS SKLLQ VGYTPDGKDD YRWC 
FRVDEVNWTTWNTNVGIINEDPGNCEGVKRT 
LSFSLRSSRVSGRHWKNh'ALVPLLREASAKi; 
RQSAQPEEVYLRQFSGSLKPEDAEVFKSPAAS 

r,EK 


732 


2082 


A 


5753 


198 


3 


AQ AE S STV A S PEAT AGPLCTRiPNVPPPTPIRP 
PGKLOAOLPCPSPVRFTSARIPPASRPQTKS 


733 


2083 


A 


5754 


2 


2223 


AAGPPGLEAEGRAPES AGPGPGGDAAE 1 PUL 

PPAHSGTLMMAFRDVTVQIANQNISVSSSTAL 

SVANCLGAQTVQAPAEPAAGKAEQGETSGR 

EAPEAPAVGREDASAEDSCAEAGASGAADG 

ATA PKTEEEEEEEETAEVGRG AEAEAGDLEQ 

LNRTSTSTKSAKSGSEASASASKDALQAMILS 

LPRYHCENPASCKSPTLSTDTLRKRLYR1GLN 

LFNINPDKG1QFLISRGFIPDTPIGVAHFLLQRK 

GLSRQMJGEFLGNSKKQFNRDVLDCWDEM 

DFSSMELDEALRKFQAHIRVQGEAQKVERUE 

AFSQRYCMCNPEVVQQFHNPDT1FILAFAIILL 

NTDMYSPNIKPDRKMMLEDFIRNLRGVDDG 

ADIPRELWGIYERIQQKELKSNEDHVTYVTK 

VEKSIVGMKTVLSVPHRRLVCCSRLFEVTDV 

NKLQKQAAHQREVFLFNDLLVILKLCPKKKS 

SSTYTFCKSVGLLGMQFQLFENEYYSHGITLV 

TPLSGSEKKQVLHFCALGSDEMQKFVEDLKE 

SJAEVTELEQIRIEWELEKQQGTKTLSFKPCGA 

QGDPQSKQGSPTAKREAALRERPAESTVEVSI 

HNRLQTSQHNSGLGAERGAPVPPPDLQPSPPR 

OQTPPLPPPPPTPPGTLVQCQQIVKVIVLDKPC 

LARMEPLLSQALSCYTSSSSDSCGSTPLGGPG 

SPVKVTHQPPLPPPPPPYNHPHQFCPPGSLLH 

GHRY^SnSRSLV 


734 


2084 


A 


5788 


8 


362 


S'SVMGDLVGQGLEEQIV ARDENSWL1DUU IP 
rDDVMRVLDIDEFPQSGNYETlGGFMMFMLR 
KIPKRTD S VKF AG YKFE WDIDNYRIDQLL VT 
RTDSKATALSPKLPDAKDKEESVA 


735 


2085 


A 


5827 


1 


1257 


MVFSAVLTAFHTGTSNTIFWYENTYMNTTL
PPPFQHPDLSPLLRYSFETMAPTGLSSLTVNST 
A VPTTP AAFKSLNLPLQITLS AEMIFILF VSFLG 
NLVVCLMVYQKAAMRSAINILLASLAFADM 
LLAVLNMPFALVTILTTRWGKFFCRVSAMF 
FWLFVIEGVAILLIISIDRFLnVQRQDKLNPYR 
AK V LI A VS W ATSFC V AFPLAVGNPDLQIPSRA 
PQCVFGYTTNPGYQAYVILISLISFFIPFLVILY 
SFMGILNTLRHNALRIHSYPEGICLSQASKLGL 
MGLQRPFQMSIDMGFKTRAFTmiLFAVFIVC 
W APFTTYSLVATFSKHFYYQHNFFEISTWLL 
WLCYLKSALNPLIYYWRIKKFHDACLDMMP 
KSFKFLPOLPGHTKRJURPSAVYVCGEHRTVV 


736 


2086 


A 


5870 


3 


268 


" FTRSDELARHYRTHTGEKRFSCPLCPKQFSKiS 
DHLTKHARRHPTYHPDMIEYRGRRRTPRIDPP 
1TSEVESSASGSGPGPAPSFTTCL 


737 


2087 


A 


5871 


2 


521 


"TTWPQLFLETLPELLHMSKPAEDGPSPUALVK 
RSSSLGY1SKAEEYFLLKSRSDLMFEKQSERH 
GLARRLTTARRPPASSEQAQQELFNELKPAV 
DO ANF I VNTiMRDQNNYNEEKDS WNRV ART 
VDRLCLFVVTPVMWGTAWIFLQGVYNQPPP 
QPFPGDPYSYNVQDKRFI 


738 


2088 


A 


5881 


1 


1160 


~ ' L V VTAITAIL AFPNE YTRMSTSELISELFNUCCi 
LLDSSKLCDYENRFNTSKGGELPDRPAGVGV 
Y S AM WQLJU^TLILKJVITIFTFGMKIP SGLFIPS 
MAVGA1AGRLLGVGMEQLAYYHQEWTVFNS 

WCSQGADCITPGLYAMVGAAACLCiUV 1KM1 

VSLVVIMFELTGGLEYlWLMAAAKfrSKWVA 

DALGREG1YDAHIRLNGYPFLEAKEEFAHKTL 

AMDVMKPRRNDPLLTVLTQDSMTVEDVETI1I 

SETTYSGFPVWSRESQRLVGFVLRRDLIISIE 

NARKKQDGVVSTSiriTTEHSPPLPPYTPPTLK 

LRNn.DLSPrTVTDLTPMEIVVDIFRKLGLRQC 

LVTHNGRIXGIITKKDVLKHIAQMANQDPDSI 

I FN 


739 


2089 


A 


5892 


2 


916 


TOIX^VP^AlSLISWWLPESARWLUNUKr 
DQALQELRKVARINGHKEAKNLTIEVLMSSV 
KEEVASAKEPRSVLDLFCVPVLRWRSCAMLV 
VNFSLL1SYYGLVFDLQSLGRDIFLLQALFGA 
VDFLGRATTALLLSFLGRRTIQAGSQAMAGL 
AILANMLVPQDLQTLRVVFAVLGKGCFGISL 
TCLTIYKAELFPTPVRMTADGILHTVGRLGA 
MMGPLILMSRQALPLLPPLLYGVISIASSLWL 
FFLPETQGLPLPDTIQDLESQKSTAAQGNRQE 
AFTVESTSLLEIVALHGAL 


740 


2090 


A 


5900 


2 


426 


^PIKTLGI GFHF S VDG V HFL'l QREVQNL W Ivb 
NLIILDTAKKHGYEWDTFTITMGRYKEFLQG 
KCGCHFHEWKSKLSKEYNnKMKRSRNHIM 
GRYFSNQSKLQQGTVTNFRSPYHVRGPINQV 
CSEILLSRMCANKRTM 


741 


2091 


A 


5910 


3 


412 


RMPESTLLIICENGYiLEAPLPTlKQEEUUHiJV 
VSYEIKDMCIKCFHFSSVKSKILRLIEIEKRER 
ORELKEKTREERRNKLAAEMGEDGEKEFQEE 
EEEKEEEEEEEEPLPEIFIPSTPSPrLCGFYSEPG 
K^wv 


742 


2092 


A 


5936 


1 


482 


MGCRLLCC WFCLLQ AGPLDTA VSQTFK Y L v 
TQMGNDKS1KCEQNLGHDTMYWYKQDSKK 
FLFaMFSYNNKELirNETVPNRFSPKSPDKAHL 
NLHIN SLELGDS A V YFC ASSQDTALQ SHCIPV 
HKPPGSARKLQGSVCTCTQGSSLHSLMASDG 

vpvp 


743 


2093 


A 


5938 


1 


1566 


MNSFFGTPAASWCLLESDVSSAPDIUiAUKiJl
RALSVQQRGGPAWSGSLEWSRQSAGDRRRL 
GLSRQTAKSSWSRSRDRTCCCRRAWWILVPA 
ADRARRERFIMNEKWDTNSSENWHPIWNVN 
DTKHHLYSDINTTYVNYYLHQPQVAAIFIISYF 
LIFFLCMMGNTWCFIVMRNKHMHTVTNLFI 
LNLAISDLLVGIFCMPITLLDN11AGWPFGNTM 
CKISGLVQGISVAASVFTLVAIAVDRFQCWY 
PFKPKLTIKTAFVIIMnWVLAITIMSPSAVMLH 
VOEEKYYRVRLNSQNKTSPVYWCREDWPNQ 
EMRKIYTTVLFANIYLAPLSLIVIMYGRIGISLF 
RAAWHTGRJOJQEQWHWSRKKQKIIKMLLI 
VALLFILSWLPLWTLMMLSDYADLSPNELQU 
NTYTYPFAHWLAFGNSS\ r NPIIYGFFNENFRRG 
FQEAFQLQLCQKRAKPMEAYALKAKSHVLIN 
TSNQLVQESTFQNPHGETLLYRKSAEKPQQE 
LVMEELKETTNSSEI 


744 


2094 


A 


5966 


149 


327 


SHVCVSHYAGSSGCPAGAGAGAVAJLUlbAVA 
1 YDYOGGRLGVARGAWYKfEAPDIRQGDM 


745 


2095 


A 


5970 


413 


856 


GAPHTDWAWAPTPMSGLGSGRGRgoi LAbS 
PLSLPLLLAGVTG1LATELFDQMARPAACMV 
CGALMWIMLILVGLGFPRMEALSHFLYVPFL 
GVCVCGAIYTGLFLPETKGKTFQEISKELHRL 
NFPRRAQGPTWRSLEV7QSTEL


746 


2096 


A 


5971 


3 


1343 


" AQTARRnGLELDTEGHRLFVAFSGCIVYLPLS 
RCARHGACQRSCLASQDPYCGWHSSRGCVD1 
RGSGGTDVDQAGNQESMEHGDCQDGATGSQ 
SGPGDSAYGVRRDLPPASASRSVPEPLLLASV 
AAAFALGASVSGLLVSCACRRAHRRRGKDIE 
TPGLPRPLSLRSLARLHGGGPEPPPPSKDGDA 
VQTPQLYTTFLPPPEGVPPPELACLPTPESTPE 
LFVKHLRAAGDPWEWNQNRNNAKEGPGRSR 
GGHAAGGPAPRVLVRPPPPGCPGQAVEVTTL 
EELLRYLHGPQPPRKGAEPPAPLTSRALPPEP 
APAIXGGPSPRPHECASPLRLDVPPEGRCASA 
P ARP ALS APAPRLG VGGGRRLPFSGHRAPP AL 
LTRVPSGGPSRYSGGPGKHLLYLGRPEGYRG 
RALKRVDVEKPQLSLKPPLVGPSSRQAVPNG 
r,RFNF 


747 


2097 


A 


5998 


2 


754 


DHASLPCSWNHRFDVETRHVFIGDHSGQV'ri 

LKLEQENCTLVTTFRGHTGGVTALCWDPVQ 

RVLFSGSSDHSVIMWDIGGRKGTAIELQGHN 

DRVQALSYAQHTRQLISCGGDGGIVVWNMD 

VERQETTEWLDSDSCQKCDQPFFWNFKQMW 

DSKKIGLRQHHCRKCGKAVCGKCSSKRSS1PL 

MGFEFE VRVCDSCHEAITDEERAPTATFHD SK 

HN1VHVHFDATRGWLLTSGTDKVIKXWDMT 

PWS 


748 


2098 


A 


6001 


2 




AMVFGGWPYVPQYRDIRRTQNADGFSTYV 

CLVLLVANILRILFWFGRRFESPLLWQSAIMIL 

TMLLMLKLCTEVRVANELNARRRSFTAADS 

KDEEVKVAPRRSFLDFDPHHFWQWSSFSDYV 

QCVLAFTGVAGYITYLSIDSALFVETLGFLAV 

LTEAMLGVPQLYRNHRHQSTEGMSIKMVLM 

WTS GDAFKTAYFLLKGAPLQFS VCGLLQ VL V 

DLAILGQAYAFARHPQKPAPHAVHPTGTKAL 


749 


2099 


A 


6002 


2 


447 


HGRPDRSELVRMHILEETFAEPSLQATQMKLK 
RARLADDLNEKIAQRPGPMELVEKNILPVDSS 
VKEAUGVGKEDYPHTQGDFSFDEDSSDALSP 
IXJPASQESQGSAASPSEPKVSESPSPVTTNTP 
AQFASVSPTVPEFLKTPPTAD 


750 


2100 


A 


6004 


2 


427 


LLTQAMLVLPHRPQWFTPGPRLQAQGPCQEU 

WRWELRLRNYVPEDEDLNKRRVPQAKPDAV 

QEKVKEQLEAAKPEPVIEEVDLAKLAPRKPD 

WDLKRDVAKKLEKLLKRTQRAIAELIRERLK 

GQEDSLDSAVDAATEHKTC 


751 


2101 


A 


6007 


33 


1280 


TDQAKVDNQPEKLVRSAEDVSTVPTQPDNFF 

SHPDKLKRMSKSVPAFLQDESDDRETDTASE 

SSYQLSRHKKSPSSLTNLSSSSGMTSLSSVSGS 

VMSVYSGDFGNLEVKGNIQFAIEYVESLKEL 

HVFVAQCKDLAAADVKKQRSDPYVKAYLLP 

DKGKMGKKKTLVVKKTLNPVYNEILRYKIEK 

QILKTQKLNLSIWHRDTFKRNSFLGEVELDLE 

T\VT)WDNKQNKQLRWYPLKRKTAPVALEAE 

mGEMKLALQYVPEPVPGKKLPTTGEVHIVA/ 

KECLDLPLLRGSHLNSF\TCCTILPDTSRKSRQ 

VrXTVWDHYKLTNQFLGGLRIGFGTGKSYGT 
EVDWMDSTSEEVALWEKMVNSPNTWIEATL 
PL RMLLI AKISK


752 


2102 


A 


6028 


108 


1283 


"'KEIFSPFELISVKPLCLLLGVTCSQSMAFEELL 
SQVGGLGRFQMLHLVFILPSLMLLIPH1LLENF 
AAAIPGHRCWVHMLDNNTGSGNErGILSEDA 

LLRlSrPLDS^RPEKCRRFVHPQWQLLHLNU 

TIHSTSEADTEPCVDGWVYDQSYFPSTIVTKW 

ni VPHYOSI KSWOFLLLTGMLVGGIIGGHV 

SDRFGRRFILRWGLLQLAITDTCAAFAPTFPV 

YC VLRFLAGFS SMUISNNSLPITEWIRPNSKAL 

W1LSSGALNIGQULGGLAYWRDWQTLHW 

ASVPFFVFFLLSRWLVESARWLnTNKLDEGL 

KALRKVARTNGIKNAEETLNIEWRSTMQEE 

LDAAQTKTTVWDLFRNPSMRKRICILVFLRK 

KNLKEKA 


753 


2103 


A 


6043 


1 


1470 


" DSFESILRLIFElHHSGEKGDIVVFLACEQULbK 
VCETVYQGSNLNPDLGELVWPLYPKEKCSL 
FKPLDETEKRCQVYQRRWLTTSSGEFLIWSN 
SVRFVIDVGVERRKVYNPR1RANSLVMQP1SQ 
SQAEIRKQILGSSSSGKFFCLYTEEFASKDMTP 
LKPAEMQEANLTSMVLFMKRIDIAGLGHCDF 
MNRP APESLMQ ALEDLDYL AALDNDGNLSE 
FGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

a » /ut a oxircciTUPUnAPF A AT TCWKTFLHPE 

GDHFTLISIYKAYQDTTLNSSSEYCVEKWCRD 
YFLN CS ALRMAD V1RAELLEIIKRIELP Y AEP A 
FGSKENTLNIKJCALLSGYFMQ1ARDVDGSGN 
YLMLTHKQVAQLHPLSGYSITKKMPEWVLF 
T T-t/ i?c t c cxrMVT R lT^FT^PFI FMOLVPOYYFSNL 
PPSESKX)ILWVVDHLSPVSTMNKEQQMCET 

CPETEQRCTLQ 


754 


2104 


A 


6055 


2 


394 


YYALHHWPFPDLLCQTTGAIFQMNMYGSCU- 

LMLINVDRYAAIVHPLRLRHLRRPRVARLLC 

LGVWALILVFAVPAARVHRPSRCRYRDLEVR 

LCFESFSDELWKGRLLPLVLLAEALGFLLPLA 

A vwss 


755 


2105 


A 


6059 


3 


1795 


"LGLGSGTLLSVSEYKKKYREHVLQLHAKVKt 
RNARSVKITTCRFTKLLIAPESAAPEEALGPAEE 
PEPGRARRSDTHTFNRLFRRDEEGRRPLTWL 
QGPAGIGK^rMAAKJKJLYDWAAGKLYQGQVD 
F AFFMPCGELLERPGTRSLADLILDQCPDRG A 
PVPQMLAQPQRLLFELEXjADELPALGGPEAAP 
CTDPFEAASGARVLGGLLSKALLPTALLLVTT 
RAAAPGRLQGRLCSPQCAEVRGFSDKDKKK 
YFYKFFRDERRAERAYRFVKENETLFALCFV 

PFVCWIVCTVLRQQLELGRDLSRTSKTTTSVY
LLFITSVLSSAPVADGPRLQGDLRNLCRLARE

GVLGRRAQFAEKELEQLELRGSKVQTLFLSK 

l/pi pcvt ETEVTYOFIDOSFQEFLAALSYLLE 

DGGVPRTAAGGVGTLLRGDAQPHSHLVLTT 

RFLFGLLSAERMRDIERKFGCMVSERVKQEA 

LRWVQGQGQGCPGVAPEVTEGAKGLEDTEE 

PEEEEEGEEPNYPLELLYCLYETQEDAFVRQA 

LCRFPELALQRVRFCRMDVAVLSYCVRCCPA 

GQALRLISCRLVAAQEKKKKSLGKRLQASLG 

GG 


756


2106


A 


6060 


12 


436 


" SGRPTRPAKn'GQGMGRFMLTLVCgCibiMM^ 
ART>LIMNNLTELQF<jLFHHLRFLEbXKLSU^ H 
LSI IIPGQ AFS GL YSLKILMLHNNQLGGIPAQA 
L WELPSLQSLRLD ANLISL VPERSFEGLS SLRH 

LWLDDNALTEIPS 


757 


2107 


A 


6063 


54 


419 


~" ITPLGLGAADMCAFPWLLLLLLLQEGSQRRL 
WRWCGSEEWAVLQESISLPLEIPPDEEVENII 
W S SHK SL ATVVPGKEGHPATIMVTNPHYQG 
OILTMLLRSLQQPSASWPRDCSSSCSW 


758 


2108 


A 


6066 


125 


438 


IGISCPATIFVPMFSHSLIGIGEEYQLPYYNMV 
PSDPSYEDMREVVCVKRLRPIVSNRWNSDEC 
LRAVLKLMSECWAHNPASRLTALRIKKTLAK 

MVESQDVK1 


759 


2109 


A 


6072 


3 


650 


PGRRFRPAALEERAMEKLREKVPFQNRGKUT 

LSS1IPNNSDTRKATETTSLSSKPEYVNPDFRW 

SKJDPSSKSGNLLETSEVGWTSNPEELDPIRLA 

LLGKSGLSCQVGSATSHPVSCQEPIDEDQRISP 

KDKSTAGREFSGQVSHQTTSENQCTPIPSSTV 

HSSVADMQNMPAAVHALLTQPSLSAAPFAQ 

RYLGTLPSTGSTTLPQCHAGNATVW 


760 


2110 


A 


6077 


3 


/ -?V 


PLRLTLMEEVLLLGLKDREGYTSFWNDCISSU 

LRGCML1ELPLRGRLQLEACGMRRKSLLTRK 

VICKSDAPTGDVLLDEALKHVKETQPPETVQ 

NWIELLSGETWNPLKLHYQLRNVRERLAKNL 

VEKGVLTTEKQNFLLFDMTTHPLTNNNIKQR 

LIKKVQEAVLDKWVNDPHRMDRRLLALIYL 

AHASDVLENAFAPLLDEQYDLATKRVRQLLD 

LDPEVECLKANTNEVLWAWAAFTK 


761 


2111 


A 


6078 


833 


390 


IVSFHLSGFKKFVRPFSFLSVHGLQVDEYHSV 
HQKLSADMADHSNLIRSLLVGAEDARLMRD 
MKTMKSRYMELYDLNRDLLNGYKIRWNNH 
TELLGNLKAVNQAIQRAGRLRVGKPKNQVIT 
ACRD AIRSNNINTLFKIMRVGTAS S


762 


2112 


A 


6079 


2 


2686 


KKAITCGEKEKQDLIKSLAMLKDGFRTDRCiS 

HSDLWSSSSSLESSSFPLPKQYLDVSSQTDISG 

SFGTNSNNQLAEKVRLRLRYEEAKRRIANLKI 

QLAKLDSEAWPGVLDSERDRLILINEKEELLK 

EMRF1SPRKWTQGEVEQLEMARKRLEKDLQ 

AARDTQ S KALTERLKLN SKRNQL VRELEE AT 

RQVATLHSQLKSLSSSMQSLSSGSSPGSLTSSR 

GSLVASSLDSSTSASFTDLYYDPFEQLDSELQ 

SKVEFLLLEGATGFRPSGCITTIHEDEVAKTQ 

KAEGGGRLQALRSLSGTPKSMTSLSPRSSLSS 

P SPPCSPLMADPLLAGD AFLNSLEFEDPEL S A 

TLCELSLGNSAQERYRLEEPGTEGKQLGQAV 

NTAQGCGLKVACVSAAVSDESVAGDSGVYE 

ASVQRLGASEAAAFDSDESEAVGATRIQIALK 

YDEKNKQFAILnQLSNLSALLQQQDQKVNIR 

VAVLPCSESTTCLFRTRPLDASDTLVFNEVFW 

VSMSYPALHQKTLRVDVCTTDRSHLEECLGG 

AQISLAEVCRSGERSTRWYNLLSYKYLKKQS 

RELKPVGVMAPASGPASTDAVSALLEQTAVE 

LEKRQEGRSSTQTLEDSWRYEETSENEAVAE 

EEEEEVEEEEGEEDVFTEKASPDMDGYPALK 

VDKETNTETPAPSPTVVRPKDRRVGTPSQGPF 

LRGSTirRSKTFSPGPQSQYVCRLNRSDSDSST 

LSKXPPFVRNSLERRSVRMKRPSPPPQPSSVK 

SLRSERLIRTSLDLELDLQATRTWHSQLTQEIS 

VLKELKEQLEQAKSHGEKELPQWLREDERFR 

LXLRMLEKKMDRAEHN1GELQTDKMMRAAA 

imvURl RGOSCKEPPEVOSFREKMAFFTRPR 

MN1PALSADDV 


763 


2113 


A 


6082 


3 


1558 


" "PHPIRFSKLCVSFhNQEYNQFCVIEEASKANfc 
VLENLTQGKMCLVPGKTRKLLFKFVAKTED 
VGKKIEITSVDLALGNETGRCVVXKWQGGGG 
DAASSQEALQAARSFKRRPKLPDNEVHWGSII 
IQASTMIISRVPN1SVHLLHEPPALTNEMYCLV 

VTVQSHEKTQIRDVKLTAGL1CPGQDANL1 QK 

THVTLHGTELCDESYPALLTDIPVGDLHPGEQ 

LEKMLYVRCGTVGSRMFLVYVSYLINTTVEE 

KEIVCKCHKDETVTIETVFPFDVAVKFVSTKF 

EHLERWADIPr^LMTDLLSASPWALTIVSSE 

LHLAPSMTTVDQLESQVDNVILQTGESASECF 

CLQCPSLGN1EGGVATGHYIISWKRTSAMENI 

PIITTViTLPHVlVENIPLHVNADLPSFGRVRES 

LP VK YHLQNKTDL VQDVEI S VEPSD AFMF SG 

LKQIRLRILPGTEQEMLYNFYPLMAGYQQLPS 

LNINLLRFPNFTN QLLRRFIPTS IF VKPQGRLM 

DDTS1AAA 


764 


2114 


A 


6093 


1


1422 


AAADLANSNAGAAVGRKAGPRSPPSAPAFA1' 

PPPAPAPPTLGNNHQESPGWRCCRPTLRERN 

ALMFNNELMADVHFVVGPPGATRTVPAHKY 

VLAVGSSVFYAMFYGDLAEVKSEIHIPDVEPA 

AFLILLKYMYSDE1DLEADTVLATLYAAKKYI 

VPALAKACVNFLETSLEAKNACVLLSQSRLF 

EEPELTQRCWEVIDAQAEMALRSEGFCEIDR 

QTLEIIVTREALNTKEAWFEAVLNWAEAEC 

KRQGLPITTRNKRHVLGRALYLVRIPTMTLEE 

FANGAAQSDILTLEETHSIFLWYTATNKPRLD 

FPLTKRKGLAPQRCHRFQSSAYRSNQWRYRG 

RCDS1QFAVDRRVFIAGLGLYGSSSGKAEYSV 

KIELKRLG VVLAQNLTKFMSDG SSNTFPV WF 

EHPVQVEQDTFYTASAVLDGSELSYFGQEGM 

TEVQCGKVAFQFQCSSDSTNGTGVQGGQIPE 

TTPYA ^ 


765 


2115 


A 


6099 


1 


1150 


SGFTHYATYDFIVKGSCFCNVHADQCIPVHU^ 

RPVKAPGTFHMVHGKCMCKHNTAGSHCQH 

CAPLYNDRPWEAADGKTGAPNECRTCKCNG 

HADTCHFDVNVWEASGNRSGGVCDDCQHN 

TEGQYCQRCKPGFYRDLRRPFSAPDACKPCS 

CHPVGSAVLPANSVTFCDPSNGDCPCKPGVA 

GRRCDRCMVGYWGFGDYGCRPCDCAGSCD 

PITGDCISSHTD1DWYHEVPDFRPVHNKSEPP 

WEWEDAQGFSALLHSGKCECKEQTLGNAKA 

FCGMKYSYVLKIKILSAHDKGTHVEVNVKIK 

KVLKLSTKLKJFRGKRTLYPESWTDRGCTCPIL 

NPGLEYLVAGHEDIRTGKLIVNMKSFVQHWK 

PSLGRKVMD1LKRECK 


766 


2116 


A 


6103 


2 


384 


" MTAAATATVLKEGVLEKRSGGLLQLWKKKK 
CVLTERGLQLFEAKGTGGRPKELSFARIKAVE 
CVESTGRHIYFTLVTEGGGEIDFRCPLEDPGW 
NAQITLGLVKFKNQQAIQTVRARQSLGTGTL 


767 


2117 


A 


6106 


1


542 


" SGSSHASDGSGFQELRlCSEDQTPLlAGMCbLf 
MARYYI1KYADQKALYTRDGQLLVGDPVAD 
NCCAEKICTLPNRGLDRTKVPIFLGIQGGSRC 
LACVETEEGPSLQLEDVNIEELYKGGEEATRF 
TFFQSSSGSAFRLEAAAWPGWFLCGPAEPQQ 
PVOLTKESEPSARTKFYFEQSW 


768 


2118 


A 


6109 


3 


292 


- P1T oavt OT SSOEARYKAFGTCVSHIGAILAF 
YTPSV1SSVMHRVARCAAPHVHILLANFYLLF 
PPMVNPIIYG VKTKQIRDSLG S IPEKGCVNRE 


769 


2119 


A 


6110 


1 


711 


'■-RHEPSCSNGVASTKSKQNHSKYPAPSSSSSSS 
SSSSSSSPSSVNYSESNSTDSTKSQHHSSTSNQ 
ETSDSEMEMEAEHYPNGVLGSMSTRIVNGAY 
KHEDLQTDESSMDDRHPRRQLCGGNQAATE 

RIELFGRELQ ALSEQLGRE Y GKNL AHTEMLQD 

AFSLLAYSDPWSCPVGQQLDPIQREPVCAAL 

NSAILESQNLPKQPPLMLALGQASECLRLMA 

RAGLGSCSFARVDDYLH 


770 


2120 


A 


6125 


2 


570 


YFGLNLHVQHLGNNVFLLQTLFGAV1LLANC 
VAPWALKYMNRRASQMLLMFLLA1CLLA11F 
VPQEMQMLREVLATLGLGASALANTLAFAH 
GNE VTPTIIRARAMG IN ATF AN 1AG AL APLMM 
ILSVYSPPLPWnYGVFPnSGFAFLLLPETRNK 
PLFDT1 QDEKNERKDPREPKQEDPRVEVTQF 


771 


2121 


A 


6126 


909 


353 


RSF VLDTAS AICNYN AH YKNHPKYWCRU Y F 

RDYCNIIAFSPNSTNHVALRDTGNQLIVTMSC 

LTKEDTGWYWCGIQRDFARDDMDFTELIVT 

DDKGTLANDFWSGKDLSGNKTRSCKAPKW 

RKADRSRTSILIICILITGLGI1SVISHLTKRRRS 

ORNRRVGOTLKPFSRVLTPKEMAPTEQM 


772 


2122 


A 


6148 


7 


810 


FVLGILALSHTlSPFMNKFFPASFPNRQYQLLh 

TQGSGENKEEIINYEFDTKDLVCLGLSSIVGV 

WTLLRKHWIANNLFGLAFSLNGVELLHLNN 

VSTGCaLGGLFrTDWWVFGTNVMVTVAKS 

FEAPIKLVFPQDLLEKGLEANNFAMLGLGDV 

V1PGIFIALLLRFDISLKKNTHTYFYTSFAAYIF 

GLGLTIFIMHIFKHAQPALLYLVPACIGFPVLV 

ALAKGEVTEMFSYEESNPKDPAAVTESKEGT 

EASASKGLEKKEK 


773 


2123 


A 


6161 


3 


1088 


CQPMLVTRKNHPKLLLRRTESV AEKMLTN W 

FTFLLYKFLKESAGEPLFMLYCAIKHQMEKG 

PIDAITGEARYSLSEDKLIRHLIDYKTLTLNCV 

NPENENAPEVPVKGLDCDTGTQAKEKLLDA 

AYKGVPYSQRPKAADMDLEWRQGRMARIIL 

QDEDVTTKIDNDWKRLNTLAHYQVTDGSSV 

ALVPKQTSAYN1SNSSTFTKSLSRYESMLRTA 

SSPDSLRSRTP^TPDLESGTKLWHLVKNHDH 

LDQREGDRGSKMVSEIYLTRLLATKGTLQKF 

VDDLFET1FSTAHRGSALPLAIKYMFDFLDEQ 

ADKHQIHDADVRHTWKSNCLPLRFWVNVIK 

NPOFVFDIHKNSITDACLSVV 


774


2124


A 


6163 


860 


125 


KTAVTCKJINLNPVFNETLRYSVPQAELQUKVL 

SLSVWHRESLGRNIFLGEVEVPLDTWDWGSE 

PTWLPLQPRVPPSPDDLPSRGLLALSLKYVPA 

GSEGAGLPPSGELHFWVKEARDLLPLRAGSL 

DTYVQCFVLPDDSRASRQRTRWRRSLSPVF 

NHTMVYDGFGPADLRQACAELSLWDHGALA 

NRQLGGTRl^LGTGSSYGLQVPWMDSTPhbK 

OLWOALLEQPCEWVDGLLPLRTNLAPRT 


775 


2125 


A 


6191 


2 


392 


" ARGIGSLGRDHSGSGGG'l GMAGAWVRKAAU 
YVRSKDFRDYLMSTHFWGPVANWGLPIAA1T 
DMKVKSPEIISRRMTFAL* CYSLTFVRFAHYVQ 
\PWNWLMLGCHTAVDFDQLISSMPCrSHGMT 

ASASAL 


776 


2126 


A 


6217 


1 


827 


~ FRGYWGVREAFTDASWSGGLGPG^UMKli 
RQKHAKKmGFFRNNFGVREPYQILLDGTFC 
QAALRGRIQLREQLPRYLMGETQLCTTRCVL 
KELETLGKDLYGAKL1AQKCQVRNCPHFKNA 
VSGSECLLSMVEEGNPHHYFVATQDQNLSVK 
VKK3CPGWLMFnQOTNm>DK^SPKTIAFV^ 
VESGXRLS QCMRKKVSN1SKRNRV* * KTLNRG 
RRKXRKXISGPNPLSCLKJCXXKAPDTQSSASE 
KXRKRKRIRNRSNPKVLSEKQNAEGE 


777


2127 


A 


6236 


1038 


1402 


YYQISSLPSIVGNGIFLWLLlClFLAKQCiUbKL- 
FQPFGRPRGGGHLRSG VLGQPGQHGETP/SFF 
YNSK1SPALWGPPVTPSALGGEAGKSL*PRRQ 
RFORGGIAPLPSRVRGRAKLFLKKK 


778 


2128 


A 


6237 


422 


913 


ASFFHHHRGAFLLLLAIPGS*GQDQSLIHWSN 

AVSNAD\LLDLK\N*LDH\LEEKMPLVEVKVVP 

PQVL\SEPN*RSGGCFSAPSFEVPPWTGEVKP/ 

SPQRDGGALG\QGPLGIPSDSILALLKKQT*RA 

LLNWPLGSLRRSSCFGGQDGQDLKPRSGLGC 

N^FRYRR 


779 


2129 


A 


6249 


420 


36 


ARAPSPSFSVRDVELSDPARERGEMP V A V ur 
YGQSQPSCFDRVKMGFVMGCAVGMAAGAL 
FGTFSCLSS1LVSSSG/SGMRGRELMGGIGKTM 
MQSGGTFGTFMA1GMGIRC*PWLPTTSVPSH 

n<5HPMY 


780 


2130 


A 


6263 


415 


1380 


RlMRMCDRGIQMLril VUAFAAFSLM1 LAVLr 

TDYWLYSRGVCRTKSTSDNETSRKNEEVMT 

HSGLWRTCCLEGAFRGVCKKIDHFPEDADYE 

rvnTAFVT T RAVRASSVFPILSVTLLFFGGLCV 

AASEFHRSRHNVILSAGIFFVSAGLSNUGirVYI 

S\ANAGRTPGQR\DSKKSYSYGWSF/YFSGAFS 

FDGR/IIC*GVGLPWH1YIEKHQQLRAKSHSEF 

LKKSTF ARLPPYRYUFRRRSSSRSTEPRSRDLS 

PISKGFHTIPSTDISMFTLSRDPSKJTMGTLLNS 

nRnHAFLOFHNSTPKEFKESLHNNPANRRTT 

PY 


781 


2131 


A 


6274 


832 


318 


RJIK VKDLKQTL AlKTAYPRCKCLVEMDgil- M 
LO VKQKQLACLCTW Q ARDPDCPPSTKWL/L 
VGPGMGCMVALFQDSIAWSNKSMPSSLSAIS 
O SPCQ VQ APEGPSSFHLPTLSFTTCLSWQGGD 
LEFLGDLKGCSELKNFQEL1TQSALVHPKADV 
WWYCGRPLLGTLPSN 


782 


2132 


A 


6281 


1324 


393 


WlSLPSSLLCRKNGSSAEDDRR\GEPSAbbAtLr 
EREDWGIGSA*SVGAVSKVFSARF*RTYPS\E 
nFFFVTHOKSSSSDSNSEEHRKKKTSRSRNK 
KKRKNKSSKRKHRKYSDSDSNSESDTNSDSD 

DDKKJWKAKJKXKKKKKHKT 

ESSDSSCKDSEEDLSEATWMEQPNVADTMDL 

IGPEAPDHTSQDEKPLKYGHA1XPGEGAAMA 

EYVKAGKRIPRRGE1GLTSEEIGSFECSGYVM 

SGSRHRRMEAVRLRKENQIYSADEKRALASF 

NOFERRKRESK1LASFREMVHKKTKGKDDK 


783 


2133 


A


6305 


201 


1032 


' WDDYPQGALRRREAAEGLHFLGPPGRVRUQ 
LRGITGPAWYCHSPSHSLLSAFCHLPTPSRCP 
AMARPP VPG SVWPN WHES/RRGQG VPGLHS 
AOEPP AG VW AA* AAS AAAAVLSIDTAS YKIF V 
SGKSGVGKTALVAKLAGLEVPWHHETTGIQ 
7TVWWPAKLQASSRVVMFRFEFWDCGESA 
LKKFDHMLLACMENTDAFLFLFSFTDRASFE 
DLPGQLARIAGEAPGVVRMVIGSKPDQYMHT 
DVPERDLTAFRQAWELPLLRVKSVPGRRLG 




784


2134


A 


6308 


86 


96 


J -GSSPDPASl^ 

^/.f-p a r-incr* A HPDPQDA A P A VF AFGPGSSOA 

PRKPEGAQARTAQSGALRDVSEELSRQLEDIL 
STYCVDNNQGGPGEDGAQGEPAEPEDAEKSR 
TYVARNGEPEPTPVVNGEKEPSKGDPNTEEIR 
OSDEVGDRDHRRPQEKKKAKGLGKEITLLM 
QTLNTL STPEEKL AALCKKY AELLEEHRNSQ 
: KQMKLLQKKQSQLVQEKDHLRGEHSKAVLA 

RSKLESLCRELQRHNRSLKEEGVQRAREEEE 

QENMELAERLKKLIEQYELREEHIDKVFKHK 

DLQQQLVDAKLQQAQEMLKEAEERHQREKD 

FLLKEAVESQRMCELMKQQETHLKQQLALY 

TEKFEEFQKTLSKSSEVFTTFKQEMEKMTKXJ 

KJCLEKETTMYRSRWESSNKALLEMAEEKTV 

RDKELEGLQVK1QRLEKLCRALQT/GAQ*PVR 

OQKWOorlK 1 oAVlur j 


785 


2135 


A 


6319 


1493 


889 


SPQGPLLRSVSPVSAGASSVTPGGAQPGVT1T 

PPSLVAVAPAPGSAAGPAAGWQ*HAGCR/WT 

KLP WS WGMRPMKIFFSEEYRSI STRISHDAL* 

EKCTQPAKPLSMIRNTGSSVSPG/PLVKWNWT 

RRJbrJKNoO I KV V ooui^vjjvio^ivi i orbunuov/ij 

QDLPLVHVDVGWQPPLGrTVGLRPGLLPLHD 

TTPCQKLVVDDLDWA 


786 


2136 


A 


6320 


551 


135 


RWLPVAECDSSCVGCTGEGPGNCKECISGYA 
REHGQCADVDECSLAEKTCVRKNENCYNTP 
GS YVCVCPDGFEET/RRCLC AAGRG* SHRRRK 
PDTAALPRRPVMCRTYPLNYSEGCPVENVAL 
RMPSPAVDSGGERLPAL 


787 


2137 


A 


6330 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNA1MG 
SGILGLAYVMANTGVFGFSFIXLTVALLASYS 
VHLLLSMCIQTAYLGP*TNYFMVLPAH*LTCL 
PLIEFLQSL*NSL\*AVTSYEDLGLFAFGLPGKL 
WAGTIIIQNIGAMSSYLLHKTELPAAIAEFLT 
GDYSRYWYLDGQTLLI1ICVGIVFPLALLPKJG 
FLGYTSSLSFFFMMFFALVVTIKKWSIPCPLTL 

TMAFSFLCHTSILPIYCELQSPSKJCRMQNVTN 

TAIALSFLIYF1SALFGYLTFYD/GTTKAQRGE 

VTCHRIKDKVESELLKG* ** IP* SHDVWMTiV 

KLC I LF A VLL\TVPL IHFP ARKA VTMMFFSNFP 

FSWIRHFLITLALNinVLLAIYVPDIRNVFGVV 

GASTSTCLIFiFPGLFYLKLSREDFLSWKKLGV 

GCFC/LLSFKTSILRNSLSVY1ILPASRKSIYFKI 


788 


2138 


A 


6351 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTM 

SASFVPNGASLEDCHCNLFCLADLTGIKWKK 

YVWQGPTSAPILFPVTEEDPILSSFSRCLKADV 

LGWWRRDQRPERRE\L*IFWGGEDP\VLLTLF 

TMTYQKiCKMECGRMDFPMNAVLCF SKA VH 

NLLERCLMNRNFVRIGKWFVKPYEKDEKPIN 

KSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 

LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQ 

AFKMSDSATKKLIGE WKQFYPI SCCLKEMSE 

EKQEDMDWEDDSLAAVEVLVAGVRMIYPAC 

FVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPAS 

TRDPAMSSVTLTPPTSPEEVQTVDPQSVQKW 

VKFSSVSDGFNSDSTSHHGGKIPRKLANHW 

DRVWQECNMNRAQNKRKYSASSGGLCEEAT 

AAKVA S WDFVEATQRTN CSCLRHKNLKSRN 

AGQQGQAPSLGQQQQILPKHKTNEICQEKSEK 

PQKRPLTPFHHRVSVSDDVGMD\ADS\ASQRL 

V\I S AP\D S Q\ VRFSNIR\TND V AK\TPQMHGTE 

MANSPQPPPLSPVHPCDWDEGVTKTPSTPQS 

QHFYQMPTPDPLVPSKPMEDRIDSLSQSFPPQ 

YQEAVEPTVYVGTAVNLEEDEAN1AWKYYK 

FPKJOCDVEFLPPQLPSDKFKDDPVGPFGQESV 

TSVTELMVQCKKPLKVSDELVQQYQIKNQCL 

SA1ASDAEQEPKIDPYAFVEGDEEFLFPDKKD 
" RQNSEREAGKKHKVEDGTSSVTVLSHEEDA 
MSLFSPSIKQDAPRPTSHARPPSTSLIYDSDLA 
VSYTDLDNLFNSDEDELTPGSKRSANGSDDK 
ASCKESKTGNLDPLSC1STADLHKMYPTPPSL 
EQHIMGFSPMNMNNKEYGSMDTTPGGTVLE 
GNSSSIGAQFKIEVDEGFCSPKPSEIKDFSYVY 
KPENCQCLVGCSMFAPLKTLPSQYLPLIKLPEE 
CIYRQSWTVGKLELLSSGPSMPFIKEGDGSNM 
DQEYGTAYTPQTHTSCGMPPSSAPPSNSGAGI 
LPSPSTPRFPTPRTPRTPRTPRGAGGPASAQGS 
VKYENSDLYSPASTPSTCRPLNSVEPATVPSIP 
EAHSLYVNLILSESVMNLFKDCNSDSCCICVC 
NMNIKGADVGVYIPDPTQEAQYRCTCGFSAV 
MNRKFGNNSGLFFEDELD11GRNTDCGKEAE 
KRFEALRATSAEHVNGGLKESEKLSDDLILLL 
QDQCTNLFSPFGAADQDPFPKSGVISNWVRV 
EERDCCNDCYLALEHGRQFMDNMSGGKVDE 
ALVKSSCLHPWSKRNDVSMQCSQDILRMLLS 
LQPVLQDAIQKKRTVRPWGVQGPLTWQQFH 
KMAGRGSYGTDESPEPLPIPTFLLGYDYDYLV 
LSPFALPYWERLMLEPYGSQRD1AYWLCPE 
NEALLNGAKSFFRDLTAIYESCRLGQHRPVSR 
LLTDGIMRVGSTASKKLSEKLVAEWFSQAAD 
GNNEAFSKLKL Y AQ VCRYDLGP YLASLPLD S 
SLLSQPNLVAPTSQSLITPPQMTNTGNANTPS 
ATLASAASSTMTVTSGVAISTSVATANSTLTT 
ASTSSSSSSNLNSGVSSNKLPSFPPFGSMNSNA 
AGSMSTQANTVQSGQLGGQQTSALQTAGISG 
ESSSLPTQPHPDVSESTMDRDKVGIPTOGDSH 
AVTYPPAIWYIIDPFTYENTDESTNSSSVWTL 
GLLRCFLEMVQTLPPHIKSTVSVQIIPCQYLLQ 
PVKHEDREIYPQHLKSLAFSAFTQCRRPLPTS 
TKVKTLTGFGPGLAMETALRSPDRPECIRLYA 
PPFILAPVKDKQTELGETFGEAGQKYNVLFV 
GYCLSHDQRWILASCTDLYGELLETCIINIDVP 
NRARRKKSSARKFGLQKLWEWCLGLVQMSS 
LPWRWIGRLGRIGHGELKDWSCLLSRRNLQ 
SLSKRLKDMCRMCGISAADSPSILSACLVAM 
EPQGSFVIMPDSVSTGSVFGRSTTLNMQTSQL 
NTPQDTSCTHILVFPTSASVQVASATYTTENL 
DLAFNPNNDGADGMGIFDLLDTGDDLDPDII 
NILPASPTGSPVHSPGSHYPHGGDAGKGQSTD 
RLLSTEPHEEVPNILQQPLALGYFVSTAKAGP 
LPDWFWSACPQAQYQCPLFLKASLHLHVPSV 
QSDELLHSKHSHPLDSNQTSDVLRFVLEQYN 
ALS WLTCDP ATQDRRSCLPIHFWLNQL YNFI 
MNML 


789 


2139 


A 


6359 


1 


2002 


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRR 

LPVGPLLRALATCHALSRLQDTPVGDPMDLK 

MVESTGWVLEEEPAADSAFGTQVLAVMRPP 

LWEPQLQAMEEPPVPVSVLHRFPFSSALQRM 

SVWAWPGATQPEAYVKGSPELVAGLCNPET 

VPTDFAQMLQSYTAAGYRWALASKPLPSVP 

SLEAAQQLTRDTVEGDLSLLGLLVMRNLLKP 

QTTPVIQALRRTRniAVNfVTGDNLQTAVTVA 

RGCGMVAPQEHLIIVHATHPERGQPASLEFLP 

MESPTAVNGVKDPDQAASYTVEPDPRSRHLA 

LSGPTFGIIVKHFPKLLPKVLVQGTVFARMAP 

EQKTELVCELQKLQYCVGMCGDGANDCGAL 

KAADVGISLSQAEASWSPFTSSMASIECVPM 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














V1REGRCSLDTSFSVFKYMALYSLTQFISVL1L 
YTINTNLGDLQFLAIDLVnTTVAVLMSRTGP 
ALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 
VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNY 
ENTWFSLS SFQYLELAAAVSKG APFR\RPLTN 
NVPFLL AS AL* SS VL WLVLSPGLLHGPLALR 
NITDTGFKLLLVGLVTLNFVGGLHAGERARP 
VPPRLP APPPAQAG\SKKRFKQLERELAEQP W 
PPLPAGPLR 


790 


2140 


A 


6380 


76 


1059 


SSAGSARKLQVMALAARLWRLLPFRRUW 

GSRLPAGTSGSRGHCGPCRFRGFEVMGNPGT 

FKRGLLLSALSYLGFETYQVISQAAVVHATA 

K VEEILEQ AD YLYESGETEKL YQLLTQ YKESE 

DAELLWRLARASRDVAQLSRTSEEEKKLLVY 

EALEYAKRA/L/EKNESSFASHKWYAICLSDV 

GDYEGIKAK1ANAYIIKEHFEKAIELNPKDATS 

IHLMGIWCYTFAEMPWYQRRIA^NACLQLPP 

* FPP YEKALG\YFHRAEQVDPNFY SKNLLLLG 

KTYLKLHNKKLAAFWLMKAKDYPAHTEED 

KOIOTEAAQLLTSFSEKN 


791 


2141 


A 


6434 


3 


1460 


" lALLIVDGLAWDDQGGLALLHISPSKLIL*QL)b 
SGMS/YVMVRCTITRAFFKSLLCHICQYSIGPQ 
* VT\CPGQDACKE*KSTAN* GG*RE* * PQVLFF 
AFLSNPAVKFGRMSKKQRDSLYAEVQKHQQ 
RLQEQRQQQSGEAEALARVYSSSISNGLSNLN 
NETSGTYANGSVIDLPKSEGYYNWSGQPSP 
DQSGLDMT\GIKQIKQEPIYDLTSVPNLFTY\SS 
FNN\GQLAPGIT\MTEIDRIAQNI1K SHLETCQY 
TMEELHQLAWQTHTYEEIKAYQSKSREALW 
QQCAIQITHAIQYWEFAKJUTGFMELCQNDQ 
ILLLKSGCLEVVLVRMCRAFNPLNNTVLFEG 
KYGGMQMFKALGSDDLVNEAFDFAKNLCSL 
QLTEEEIALFSSAVLISPDRAWLIEPRKVQKLQ 
EKIYFALQHVIQKNHLDDETLAKLIAKIPTITA 
VCNLHGEKLQVFKQSHPEIVNTLFPPLYKELF 

NPDCATACK 


792 


2142 


A 


6440 


no 


781 



" SRGTFRCFCRDFFPCFSNMRLFLWNAVLTUbV 
TSLIGALIPEPEVKIEVLQKPFICHRKTKGGDL 
ML VHYEG YLEKDGSLFHSTHKHNN GQPI WFT 
LGILEALKGWGPG A* K/DMC VGEKRKLIIPP A 
LGYGKEGKGKIPPESTLIFN1DLLE1RNGPRSH 
ESFQEMDLNDDWKLSKDEVKAYLKKEFEKH 
GAWNESHHDALVEDIFDKEDEDKDGFISAR 
EFTYKHDEL 


793 


2143 


A 


6446 


3201 


152 


" PRLKRLVVTEEDGGARPEALGKIAPRTPABLG 
ARADQELVTALMCDLRRPAAGGMMDLAYV 
CEWEKWSKSTHCPSWLACAWSCRNLIAFTM 
DLRSDDQDLTRMIHILDTEHPWDLHSIPSEHH 
EAITOLEWDQSGFPGFLFSRWPTGQIKNCWS 
MGVSTLA\NSWE\SSVGSL\VEGGPHLWALS\ 
WLHuNGVKLALHVEKSGASSFGEKFSR\VKFS 
P\SLTLF\GGNAMEGWIAVTVSGLVTVSLLQ\P 
SGQVL\TST\ESLCRLRARVALADIAFTGGGN1 
m7AT A tv^qq A \QPVOFYTC VPVS VVSEKCRIDT 
DD^PSLFMRCTTOLNRKDKFPAITHLKFLARD 
MSEQVLLCASSQTSSIVECWSLRKEGLPVNNI 
FQQISPVVGDKQPTILK^TULSATN'DLDRVSA 
V\ALPKLPISLTNTDLKVASDTQFYPGLGLAL 
AFHDGSVHIVHRLSLQTMAVFYSSAAPRPVD 
EPAMKRPRTAGPAVHLKAMQLSWTSLALVG 












~IDSHGKLSV\LRLSPSMGHPLEVGLALRHLLFL 
LEYCMVTGYDWWDILLHVQPSMVQSLVEKL 
HEEYTRQTAALQQVLSTRILAMKASLCKLSP 
CTVTRVCDYHTKLFLIA1SSTLKSLXRPHFLNT 
PDKSPGDRLTE1CTKTOVDIDKVN1INLKTEEF 
VLDMNTLQALQQLLQWVGDFVLYLLASLPN 
QPCPTSEPCPTSEPSPTSEPSPTSEPSSP»SLC\G 
SLLRPGHSFLRIX}TSLGMLRELM\ r VIRrWGLL 
KPSCLPVYTATSDTQDSMSLLFRLLTKLWICC 
RDEGPASEPDEALVDECCLLPSQLLIPSLDWL 
PASDGLVSRLQPKQPLRLQFGRAPTLPGSAAT 
LQLDGLARAPGQPKIDHLRRLHLGACPTEEC 
KACTRCGCVTMLKSPNRTTAVKQWEQRWIK 
NC/LVRWALVAGAPQLPLSPAAPQLLLSYPSA 
APEPGCCKSHRSPWTLLGAVNLSPPCRAVEG 
RGPDACVTSRASEEAPAFVQLGPQSTHHSPRT 
PRSLDHLHPEDRP 


794 


2144 


A 


6490 


418 


585 


NGDKADLENESCRAQVLMPVVPALWEAEGG 
GSEEPRDLRLQ*AVITPLATPAWVTQ
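The table legend defines the symbols used in the amino acid sequence column: one-letter residue codes, X for unknown, * for a stop codon, / for a possible nucleotide deletion and \ for a possible nucleotide insertion. A minimal sketch of how one row might be checked against its predicted beginning and end nucleotide locations follows. The helper name `summarize_entry` and its output fields are hypothetical and are not part of the specification; treating whitespace as layout only, and reading a beginning location larger than the end location as a separate case to be flagged, are assumptions of this sketch.

```python
import re

# Marker characters taken from the table legend.
STOP = "*"        # stop codon
DELETION = "/"    # possible nucleotide deletion
INSERTION = "\\"  # possible nucleotide insertion
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")  # one-letter codes listed in the legend
ALLOWED = RESIDUES | {STOP, DELETION, INSERTION}


def summarize_entry(begin, end, raw_sequence):
    """Summarize one table row (hypothetical helper, not from the source)."""
    # Assumption: whitespace only reflects how the row was wrapped in the table.
    seq = re.sub(r"\s+", "", raw_sequence)
    # Characters outside the legend's alphabet are reported, not silently dropped.
    unexpected = sorted({ch for ch in seq if ch not in ALLOWED})
    span = abs(end - begin) + 1  # nucleotides covered, counting both end points
    return {
        "symbols": sum(ch in RESIDUES or ch == STOP for ch in seq),
        "stops": seq.count(STOP),
        "deletions": seq.count(DELETION),
        "insertions": seq.count(INSERTION),
        "unexpected": unexpected,
        "span_nt": span,
        "codons_in_span": span / 3,
        "begin_after_end": begin > end,
    }


# Row 794 of the table: predicted beginning 418, predicted end 585.
row_794 = summarize_entry(
    418,
    585,
    "NGDKADLENESCRAQVLMPVVPALWEAEGG G S EEPRDLRLQ * A VITPLATP A WVTQ",
)
print(row_794)
```

For this row the 168-nucleotide span corresponds to 56 codons, which equals the number of residue and stop symbols listed. Rows that contain / or \ markers, or characters outside the legend, would not be expected to satisfy this simple relationship and may need manual review.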


795 


2145 


A 


6499 


395 


1027


KLLWLPPHSEQKRSPLYHPQGPSUriPbA^b 

SHSPPPSLLQA\PSIAAFLRTHGHISASGPLRMP 

FPH/H*NAFLLVFPGQRSQLTS/PSHYLCREVFP 

DHHHHLCRLSLESSPLFF1HRVLFCVPKQNVN 

STRAQIFCLFVHIVGCRCINTFPLHLFRLHLWL 

HFLQ1PLCKKKKSVKLGKTVVGRGCQSAAGS 

DTRVRAAVGAPGLPVEPLV 


796 


2146 


A 


6503 


68 


936 


" HS ALLTHS SFC VFTLCQDFFTYSSMSEEVT Y A 
DLQFQNSSEMEKIPEIGKFGEKAPPAPSHVWR 
PAALFLTLLCLLLLlGLG\O^SMFHVTLKlEM 
KJCMNTCLQN1SEELQRN1SLQLMSNMN1SNKJR 
NLSTTLQTIATKLCRELYSKEQEHKCKPCPRR 
WIWHKDSCYFLSDDVQTWQESKMACAAQN 
ASLLKINNKNALEFIKSQSRSYDYWLGLSPEE 
DSA r SWYESG*YNQ\PSAWVIRNAPDL>^IMY 
CGYINRLYVQYYHCTYXQRMICEKMANPVQ 

LGSTYFREA 


797 


2147 


A 


6507 


1 


881 


PGSTHASARSQVPRSAGEAAPHSRRPPGLLPH

APRAASAQLEERMRDPHPGMTLQEGDCRGS 

QTVSLTMGTADSDEMAPEAPQHTHIDVHIHQ 

ESALAKLLLTCCSALRPRATQARGSSRLLVAS

WVMQIVLGILSAVLGGFFYIRDYTLLVTSGA 

AIWTGAVAVLAGAAAFIYEKRGGTYWALLR

TLLALAAFSTAIAALKLWNEDFRYGYSYYNS 

ACRISSSSDWNTPAPTQSPEEVRRLHLCTSFM 

DMLKALFRTLQAMLLGVWILLLLASLTPLWL 

/SIVRGECSQPKG*VPKKRDQKEMLEVSGI*PG 

STHASARSQVPRSAGEAAPHSRRPPGLLPHAP 

RAASAQLEERMRDPHPGMTLQEGDCRGSQT 

VSLTMGTADSDEMAPEAPQHTHIDVHIHQES 

ALAKLLLTCCSALRPRATQARGSSRLLVASW 

VMQIVLGILSAVLGGFFYIRDYTLLVTSGAAI

WTGAVAVLAGAAAFIYEKRGGTYWALLRTL 

LALAAFSTAIAALKLWNEDFRYGYSYYNSAC

RISSSSDVVNTPAPTQSPEEVRRLHLCTSFMDM 

LKALFRTLQAMLLGVWILLLLASLTPLWLYC 

WRMFPTKGVSP 


798 


2148 


A 


6528 


912 


2287 


" VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAP 
RFLVAFAYWNHYLSCTSPCSCYRPLCRLNFG 
LNVVENLALLVLTYVSSSEDF,T*WVPG*GRSG 














EVFPEGTGLPLPHSDLPTSWCGHSLQCUSl^Sb 
FPP AIHEN AFI VFIAS SLGHMLLTCIL WRLTKK 
HTVSQEXDGLSLAGAPRQPRRKSRTSVLRIRV 
MVRWELSSNGNPGRGVLGLGLGLGNKLRVY 

GQNLGL* HCVWVVWETGE* KRWRLQMGEE* 

GVASRRQ*VRNSVRGLVCHNSSAPPMYMGFF 

SPTVFGGGVGG*LHVTFILHPPEVEAAGIPLLL 

GPSLPQRQGREH1W1LAAPACAPFHDR*WEP 

REIRPSP*ELGLRGEPTLSYPASCRVIRQPIP*D 

RKSYSWKQRLFIINF1SFFSALAVYFRHNMYC 

EAGWTIFAILEYTVVLTNMAFHMTAWWDF 

GNKELLITSQPEEKRF 


799 


2149 


A 


6529 


1 


874 


FFFFQRINFIEHSGSVSLLAL ACDLGWCJbU W S 

CCLVQGGGDLVDWQTNHGEDEAGGDTDSV 

DEARCKESQQEAQENLREDLCLESFAKDKIL 

QIIEGSEREHEETRTKQAALDGEPLGGGQLTA 

VHLHPSKEQQGQEGGERQRGARTHHWRGW 

EKGRRVRLRPPSGKLRADQPVRKLGGPTPS/T 

ELPGLQPHAPTPHTA/PATPTYSPAPDTPNPPV 

RWKCPLPVEPRTRQLCRERTRKACPPKPRPPL 

GLPGDPTGPVTHHAPPVSPTGASGQERRAEP 

GAVSYAHASATK 


800 


2150 


A 


6544 


2 


662 


SAQRWAAVAGRWGCRLLALLLLVPCjFCjUAS 

EITFELPDNAKQCFYEDIAQGTKCTLEFQVTTG 

GHYDVDCRLEDPDGKVLYKEMKKQYDSFTF 

TASKNGTYKFCFSNEXFSTFTHKTVYFDFQVG 

EVTHLCFLVR/DRVSALTQMESACVSIHEALKS 

VIDYQTHFRLREAQGRSRAEDLNTRVAYWSV 

nEALILLVVSIGOVFLLKSFFSDKRTTTTRVGS 


801 


2151 


A 


6556 


1 


1319 


TPCMECIKGEGLRHPQNLSGSgRKl'gi fcuarvi 

IXjWRRMPRWGLLLLLWGSCrFGLPTDTTTF 

KRIFLKRMP S IRE SLKERG VDMARLGPE WS QP 

MKRLTLGNTTSSVILTNYMDTQYYGEIGIGTP 

POTFK WFDTG SSNV WVPS SKCSRLYTACVY 

HKLFDASDSSSYKHNGTELTLRYSTGTVSGFL 

SQDIITVGGITVTQMFGEVTEMPALPFMLAEF 

DGVVGMGFIEQAIGRVTPIFDNIIS QG VLKED 

VFSFYYNRDSENSQSLGGQIVLGGSDPQHYE 

GNFHYINLIKTGVWQIQMKGVSVGSSTLLCE 

DGCLALVDTGASYISGSTSSIEKLMEALGAKE 

KRLFDYVVKCNEGPTLPPTFLFLLGGKDTPLT 

SADYLFQESYSSKKLSTLAIHAMYIPPPTGPTL 

\ AT .G ATFMRKF YTEFDRGNNPHGFALAR 


802 


2152 


A 


6567 


13 


6147 


"MCLGRMGASSPRSPEPVGPPAPGLPFCCCiU^ 
LAVVVLLALPVAWGQCNAPEWXLPFARPTNL 
TDEFEFPIGTYLNYECRPGYSGRPFSIICLKNS 
V WTG AKDRCRRKS CRNPPDP VNGMVHVDCG 
1QFGSQIKYSCTKGYRLIGSSSATCIISGDTV1W 
DNETPICDRIPCGLPPTITNGDF1STNRENFHY 
GSVVTYRCNPGSGGRKVFELVGEPSIYCTSND 
DQVGIWSGPAPQCIIPNKCTPPNVENGILVSD 
NRSLFSLNEWEFRCQPGFVMKGPRRVKCQA 
LNKWEPELPSCSRVCQPPPDVLHAERTQRDK 
t-ixttt cpnnp vfy RCV PG YDLRGAASMRCTPQG 
DWSPAAPTCEVKSCDDFMGQLLNGRVLFPV 
NLQLGAKVDFVCDEGFQLKGSSASYCVLAG 
MESLW^NSSVPVCEQIFCPSPPVIPNGRHTGKP 
LEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIR 
CTSDPQGNGVWSSPAPRCGILGHCQAPDHFL 
FAKLKTQTNASDFPIGTSLKYECRPEYYGRPF 
2153


6574


3233

SITCLDNLVAVSSPKDVCKRKSCKTPPDPVNG 

\mmroiQVGSRINYSCTTGHRLIGHSSAECI 

LSGNAAHWSTKPPICQRIPCGLPPTIANGDFIS

TNRENFHYGSWTYRCNPGSGGRKVFELVGE 

PSIYCTSNDDQVGIWSGPAPQCIIPNKCTPPNV

ENGELVSDNRSLFSLNEWEFRCQPGFVMKGP

RRVKCQALNKWEPELPSCSRVCQPPPDVLHA 

ERTQRDKDNFSPGQEVFYSCEPGYDLRGAAS 

MRCTPQGDWSPAAPTCEVKSCDDFMGQLLN 

GRVLFPVNLQLGAKVDFVCDEGFQLKGSSAS 

YCVLAGMESLWNSSVPVCEQIFCPSPPVIPNG 

RHTGKPLEVFPFGKAVNYTCDPHPDRGTSFD 

LIGESTIRCTSDPQGNGVWSSPAPRCGILGHC

QAPDHFLFAKLKTQTNASDFPIGTSLKYECRP 

EYYGRPFSITCLDNLVWSSPKDVCKRKSCKTP 

PDPVNGMVHVITDIQVGSRINYSCTTGHRLIG

HSSAECILSGNTAHWSTKPPICQRIPCGLPPTI 

ANGDFISTNRENFHYGSVVTYRCNLGSRGRK 

VFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPN 

KCTPPNVENGILVSDNRSLFSLNEVVEFRCQP 

GFVMKGPRRVKCQALNKWEPELPSCSRVCQ 

PPPEILHGEHTPSHQDNFSPGQEVFYSCEPGY 

DLRGAASLHCTPQGDWSPEAPRCAVKSCDDF 

LGQLPHGRVLFPLNLQLGAKVSFVCDEGFRL 

KGSSVSHCVLVGMRSLWNNSVPVCEHIFCPN 

PPAILNGRHTGTPSGDIPYGKEISYTCDPHPDR 

GMTFNLIGESTERCTSDPHGNGVWSSPAPRCE 

LSVRAGHCKTPEQFPFASPTIPINDFEFPVGTS 

LNYECRPGYFGKMFSISCLENLVWSSVEDNC 

RRKSCGPPPEPFNGMVHTNTOTQFGSTVNYSC 

NEGFRLIGSPSTTCLVSGNNVTWDKKAPICEn 

SCEPPPTISNGDFYSNNRTSFHNGTVVTYQCH 

TGPDGEQLFELVGERSIYCTSKDDQVGVWSS 

PPPRCISTNKCTAP EVEN AIR VPGNRSFFSLTEI 

IRFRCQPGFVMVGSHTVQCQTNGRWGPKLPH 

CSRVCQPPPEILHGEHTLSHQDNFSPGQEVFY 

SCEPSYDLRGAASLHCTPQGDWSPEAPRCTV 

KSCDDFLGQLPHGRVLLPLNLQLGAKVSFVC 

DEGFRLKGRSASHCVLAGMKALWNSSVPVC 

EQIFCPNPPAILNGRHTGTPLGDIPYGKEVSYT 

CDPHPDRGMTFNLIGESTIRRTSEPHGNGVWS 

SPAPRCELPVGAACPHPPKIQNGHYIGGHVSL 

YLPGMTISYTCDPGYLLVGKGFIFCTDQGIWS 

QLDHYCKEVNCSFPLFMNGISKELEMKKVYH 

YGDYVTLKCEDGYTLEGSPWSQCQADDRWD 

PPLAKCTSRTHDALIVGTLSGTIFFILLIIFLSWI 

ILKHRKGNNAHENPKEVAIHLHSQGGSSVHP 

RTLQTNEEN SRVLP 

HGRSARLAAVPAEAMPGPRRPAGSRLRLLLL 

LLLPPLLLLLRGVSHAGNLTVAWLPLANTSY 

PWSWAVRVGPAVELALAQVKARPDLLPGWT 

VRTVLGSSENALGVCSDTAAPLAAVDLKWE 

HNPAVFLGPGCVYAAAPVGRFTAHWRVPLL 

TAGAPALGFGVKDEYALTTRAGPSYAKLGDF 

VAALHRRLGWERQALMLYAYRPGDEEHCFF 

LVEGLFMRVRDRLNITVDHLEF.AEDDLSHYT 

RLLRTMPRKGRVIYICSSPDAFRTLMLLALEA 

GLCGED YVFFHLDIFGQSLQGGQGPAP RRPW 

ERGDGQDVSARQAFQAAXHTYKDPDNPEYL 

EFLKQLKJILAYEQFNFTMEDGLVNTIPASFH 














DGLIX YIQAVTEl'LAHGGTVTDGENITQKM W 

NRSFQGVTGYLKIDSSGDRETDFSLWDMDPE 

NGAFRWLNYNGTSQELVAVSGRKLNWPLG 

YPPPDIPKCGFDNEDPACNQDHLSTLEVLALV 

GSLSLLGILIVSFFIYRKMQLEKELASELWRVR 

WEDVEPSSLERHLRSAGSRLTLSGRGSNYGSL 

LTTEGQFQVFAKTAYYKGNLVAVKRVNRKR 

IELTRKVLFELKHMRDVQNEHLTRFVGACTD 

PPNICILTEYCPRGSLQDILENESITLDWMFRY 

SLTNDIVKGMLFLHNGAICSHGNLKSSNCVV 

DGRFVLKITDYGLESFRDLDPEQGHTVYAKK 

LWTAPELLRMASPPVRGSQAGDVYSFGIILQE 

IALRSGVFHVEGLDLSPKEIIERVTRGEQPPFR 

PSLALQSHLEELGLLMQRCWAEDPQERPPFQ 

QIRLTLRKFNRENSSNILDNLLSRMEQYANNL 

EELVEERTQAYLEEKRKAEALLYQILPHSVAE 

QLKRGETVQAEAFDSVTIYFSDIVGFTALSAE 

STPMQVVTLLNDLYTCFDAVIDNFDVYKVET 

IGD A YMW SGLP VRNGRLHACE V ARMALAL 

LDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGV 

VGLKMPRYCLFGDTVNTASRMESNGEALVKl 

HLSS\ETKAVL\EEFGGFELELRGDVEMKGKG 

KVRTYWLLGERG S STRG 


804 


2154 


A 


6585 


2 


3837 


DAPGRPPVRLPTMELEDGVVYQEEPGGSUAV 

MSERVSGLAGSIYREFERLIVRYDEEWKELIP 

LWAVLENLDSVFAQDQEHQVELELLRDDNE 

QLITQYEREKALRKHAEEKFIEFEDSQEQEKK 

DLQTRVESLESQTRQLELKAKNYADQISILEE 

REAELKKEYNALHQRHTEMIHNYMEHLERT 

KLHQLSGSDQLESTAHSRJRKERPISLGIFPLP 

AGDGLLTPDAQKGGETPGSEQWKFQELSQPR 

SHTSLKDELSDVSQGGSKATTPASTANSDVA 

XIPTDTPLKEENEGFVKVTDAPNKSEISKHIE V 

QVAQETRNVSTGSAENEEKSEVQAIIESTPEL \ 

DMDKDLSGYKGSSTTTKGIENKAFDRNTESL 

FEELS SAGSGLIGDVDEGADLLGMGREVENLI 

LENTQLLETKNALNIVKNDLIAKVDELTCEK 

DVLQGELEAVKQAKLKLEEKNRELEEELRKA 

RAEAEDARQKAKDDDDSDIPTAQRKRFTRVE 

MARVLMERNQYKERLMELQEAVRWTEMIR 

ASRENPAMQEKKRSSIWQFFSRLFSSSSNTTK 

KPEPPVNLKYNAPTSHVTPSVKKRSSTLSQLP 

GDKSKAFDFLSEETEASLASRREQKREQYRQ 

VKAHVQKEDGRVQAFGWSLPQKYKQVTNG 

QGENKMKNLPVPVYLRPLDEKDTSMKLWCA 

VGVNLSGGKTRDGGSWGASVFYKDVAGLD 

TEGSKQRSASQSSLDKLDQELK£QQKELKNQ 

EELSSLVWICTSTHSATKVLIIDAVQPGMLDS 

FTVCNSHVLCIASVPGARETDYPAGEDLSESG 

QVDKASLCGSMTSNSSAETDSLLGGrrVVGC 

S AEG VTG AATSPSTNG ASP VMDKPPEME AEN 

SEVDENVPTAEE\ATEATEGNAGSAJEDTV\DIS 

QTGVYTEHVFTDPLGWQIPEDLSPVYQSSND 

a ■* r T /-T->/-ito ^ n ttxtt? /~\T~u A/Dirt a nvxiQQl I PT 
SDAYKDQIS VlJ^bQULVKlinA^iUViaoi-i^r i 

MWLGAQNGCLYVHSSVAQV-TOCCLHSIKLKD 

SILSIVHVKGIVLVALADGTLAIFHRGVDGQW 

DLSNYHLLDLGRPHIiSmCMTVVHDKVWCG 

YRNKIYVVQPKAMKIEKSFDAHPRKESQVRQ 

LAWVGDGVWVSIRLDSTLRLYHAHTYQHLQ 

DVDIEPYVSKMLGTGKLGFSFVRITALMVSC 














"NRLWVGTGNGVIlSIPLTETrVILHQGRLLGLR 
ANKTSGVPGNRPGSVIRVYGDENSDKVTPGT 
FIPYCSMAHAQLCFHGHRDAVKFFVAVPGQV 
1SPOSSSSGTDLTGDKGRGHLHRSLWRRP 


805 


2155 


A 


6605 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATJ* 1 g 

SLKVVTKRGSADGCTDWSIDIKKYQVLVGEP 

VR1KCALFYGYIRTNYSLAQSAGLSLMWYKS 

SGPGDFEEPIAFDGSRMSKEEDS1WFRPTLLQ 

DSGLYACVIRNSTYCMKVSISLTVGENDTGL 

CYNSKMKYFEKAELSKSKE1SCRDIEDFLLPT 

REPEILWKJECRTKTWRPSIVFKRDTLLIREV 

REDDIGNYTCELKYGGFWRRTTELTVTAPL 

TDKPPKLLYPMESKLTIQETQLGDSANLTCRA 

FFGYSGDVSPLIYWMKGEKFIEDLDENRVWE 

SDIMCILKEHLGEQEVSISLIVDSVEEGDLGNYS 

CYVENGNGRRHASVU.HKRELMYTVELAGG 

LGAILLLLVCLVTIYKCYKIEIMLFYRNHFGA 

EELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGT 

vrenvARPVnOSKRl TTVMTPNYVVRRGWSIF 

ELETRLRNMLVTGEIKVILIECSELRGIMNYQE 

VEALKHT1KLLTVIKWHGPKCNKLNSKFWKR 

LQYEMPFKRIEPITHEQALDVSEQGPFGELQT 

VSAISMAAATSTALATAHPDLRSTFHNTYHS 

QMRQKHYYRSYEYDVPPTGTLPLTSIGNQHT 

YCNIPMTLINGQRPQTKSSREQNPDEAHTNSA 

ILPLLPRETSISSVIW 


806 


2156 


A 


6614 


3 


1584 


NSARGGVG\01GARAMATVQEKAAALNLSAL 

HSPAHRPPGFSVAQKPFGATYVWSSIINTLQT 

QVEVKKRRHRLKRHNDCFVGSEAVDVIFSHL 

1QNKYFGDVDIPRAKVVRVCQALMDYKVFE 

AVPTKVFGKDKKPTFEDSSCSLYRFTTIPNQD 

S QLGKENKLYSPARY AD ALFKSSDIRS ASLED 

LWENLSLKPANSPHVNISTTESPQVINEVWQE 

ETIGRLLQLVDLPLLDSLLKQQEAVPKJPQPK 

RQSTMVN S SNYLDRGILKA Y SDSQEDE WLS A 

Ainn EYLPDOMWEISRSFPEQPDRTDLVKE 

LLFDAIGRYYSSREPLLNHLSDVHNGIAELLV 

NGKTEIALEATQLLLKLLDFQNREEFRRLLYF 

MAVAANPSEFKLQKJESDNRMVVKRTFSKATV 

DNKNLSKGKTDLLVLFLVMDHQKDVFKIPGT 

LIHKIVSVVTCVLMAIQNGRDPNRDAGYIYCQRJ 

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSA 

KEKKKVLLGQFYKCHPDIFIEHEGD 


807 


2157 



A 


6615 


4198 


2094 


FGIVGTFALETDELDSDRDPAIFSLCDFGAMR 

PQILLLLALLTLGLAAQHQDKVPCKMAOCML 

CPDRVDKKVSCQVLGLLQVPSVLPPDTETLD 

LSGNQLRSILASPLGFYTALRHLDLSTNEISFL 

QPG AFQALTHLEHLSLAHNRL AMATALS AG 

GLGPLPRVTSLDLSGNSLYSGLLERLLGEAPS 

LHTLSLAENSLTRLTRHTFRDMPALEQLDLHS 

NVLMDIEDGAFEGLPRLTHLNLSRNSLTCISD 

FSLQQLRVLDLSCNSIEAFQTAS\QPQAEFQLT 

WLDLRENKLLHFPDLAALPRLIYLKLSNKLIR 

LPTGPPQDSKGIHAPSEGWSALPLS\APSGNAS 

GRPLSQLLNLDLSYNEIELIPDSFLEHLTSLCFL 

NLSRNCLRTFEARRLGSLPCLMLLDLSHNALE 

TLELGARALGVSLRTLLLQGNALRDLPPYTFA 

NLASLQRLNLQGNRVSPCGGPDEPGPVSGCVN 

AFSGITSLRSLSL\T)NEIELLRAGAFLHTPLTE 














LDLSSNPGLEVATGALGGLEASLEVLALQGN 

GLMVLQVD1PCHCLKRLNLAENRLSHLPAW 

TQAVSLEVLDLRNNSFSLLPGSAMGGLETSLR 

RLYLQGNPLSCCGNGWLAAQLHQGRVDVDA 

TQDLICRFSSQEEVSLSHVRPEDCEKGGLKNI 

NLniLTFlLVSAlLLTTLAACCCVRRQKFNQQ 

YKA 


808 


2158 


A 


6619 


153 


1852 


FKALSQYIYTNTHLEREAAFEVAILLRRMEbU 

ARHRNNTEKKHPGGGESDASPEAGSGGGGV 

ALKKElGLVSACGIlVGNnGSGIFVSPKGVLEN 

AGSVGLALIVWIVTGF1TWGALCYAELGVNI 

PKSGGDYFYVKDIFGGLAGFLRLW1AVLVIYP 

TNQAV1ALTFSNYVLQPLFPTCFPPESGLRLLA 

AICLLLLTWVNCSSVRWATRVQDIFTAGKLL 

ALALIIIMGIVQICKGEYFWLEPKNAFENFQEP 

DIGLVALAFLQGSFAYGGWNFLNY\VTEELV 

DP\ YKN1APRA I FT S I P\L VTFVYVF ANV / AL YVT 

AMSPQEL\LAS\NAVAVTFGEKLLGVMAWIM 

PISVALSTFGGVNGSLFTSSRLFFAGAREGHLP 

SVLAM1HVKRCTPIPALLFTC1STLLMLVTSD 

MYTLINYVGFINYLFYGVTVAGQIVLRWKKP 

DIPRPIKINLLFP1IYLLFWAFLLVFSLWSEPW 

CGIGLAIMLTGVPVYFLGVYWQHKPKCFSDFI 

ELLTLVSQKMCVVVYPEVERGSGTEEANED 

MEEQQQPMYQPTPTKDKDVAGQPQP 


809 


2159 


A 


6621 


1041 


223 


QDSRKMLPSTSVNSLVQGNGVLNSRDAARH 

TAGAKRYKYLRRLFRFRQMDFEFAAWQMLY 

LFTSPQRVYRNFHYRKQTKDQWARDDPAFL 

VLLSIWLCVST1GFGFVLDMGFFETIKLLLWY 

VLIDCVGVGLLIATLMWFISNKYLVKRQSRD 

YDVEWGYAFDVHLNAFYPLLVILHFIQLFFIN 

HVILTDTFIGYLVGNTLWLVAVGYYTYVTFL 

GYSVGLLFFSXALPFLKKTVILLYPFAPLILLYG 

LSLALGWNFTHTLCSFYKYRVK 


810 


2160 


A 


00/.J 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAA 
QQVAEDKFVFDLPDYESINHVVVFMLGTIPFP 
EGMGGSVYFSYPDSNGMPVWQLLGFVTNGK 
PSAIFKISGLICSGEGSQHPFGAMNIVRTPSVAQ 
IGISVELLDSMAQQTPVGNAAVSSVDSFTQFT 
QKMLDNFYNFASSFAVSQ/VPDDTQ/RPSEMF 
IPANVVLKWYENFQRRTSTEPSLLENIIW1KIN 

F 


811 


2161 



A 


6627 


18 


3367 


' LEGSLNTERAKYYLTrTMPHFTVTKVEDPEliU 
AAASISQEPSLADDCARIQDSDEPDLSQNSITG 
EHSQLLDDGHKKARNAYLNNSNYEEGDEYF 
DKNLALFEEEMDTRPKVSSLLNRMAKYTNLT 
QGAKEHEEAENITEGKKKPTKTPQMGTFMG 
VYLPCLQNIFGVILFLRLTWVVGTAGVLQAF 
AJVLICCCCTMLTAISMSAIATNGWPAGGSY 
FM1SRALGPEFGGAVGLCFYLGTTFAAAMYIL 
GAIEIFLVYIVPRAAIFHSDDALKESAAMLKN 
MRVYGTAFLVLMVLVVF1GVRYVNKFASLFL 
AC VI V S IL AI Y A G AIKS SF AP P HFPVCML GNRT 
t ecDUTnvr^KTTCFrNTNMTWSICLWGFFCNSS 
QFFNATCDEYFVHNNVTSIQGIPGLASGIITEN 
LWSNYLPKGEIIEKPSAKSSDVLGSLNHEYVL 
VDITTSFTLLVGIFFPSVTGIMAGSNRSGDLKD 
AQKS1PIGTILAILTTSFVYLSNVVLFGACIEGV 
VLRDKJFGDAVKGNLWGTLSWPSPWVIV1GS 
FFSTCGAGLQSLTGAPRLLQAIAKDMIPFLRV 














FGHSKANGEPTV/ALLXTAAIAELGIL1ASLDL 

VAPILSMFFLMCYLFVNLACALQTLLRTPNW 

RPRFRYYHWALSFMGMSICLALMFISSWYYA 

IVAMVIAGM1YKYIEYQGAEKEWGDGIRGLS 

LSAARFALLRLEEGPPHTKNWRPQLLVLLKL 

DEDLHVKHPRLLTFASQLKAGKGLTIVGSV1V 

GNFLENYGEALAAEQTIKHLMEAEKVKGFCQ 

LVVAAKLREG1SHLIQSCGLGGMKHNTVVM 

GWPNGWRQSEDARAWKTFIGTVRVTTAAHL 

ALLVAKNISFFPSNVEQFSEGNIDVWWIVHDG 

GMLMLLPFLLK\QHKVWRKCSIRFF\TVAQLE 

DNSlQMKKDLATFLYHLRIEAEVEYVbMHDb 

DISAYTYERTLMMEQRSQMLRHMRLSKTER 

DREAQLVKDRNSMLRLTSIGSDEDEETETYQ 

EKVHMTWTKX)KYMASRGQKAKSMEGFQDL 

LNMRPDQSNVRRMHTAVKLNEVTVNKSHEA 

KLVLLNMPGPPRNPEGDENYMEFLEVLTEGL 

ERVLLVRGGGSEVTTIYS 


812 


2162 


A 


6628 


66 


640 


AVCTMSEMAELSELYEESSDLQMDVMPGEG 

DLPQMEVGSGSRELSLRPSRSGAQQLEEEGP 

MEEEEAQPMAAPEGKRSLANGPNAGEQPGQ 

VAGADFESEDEGEEFDDWEDDYDYPEEEQLS 

GAGYRVSAALEEADKMFLRTREPALDGGFQ 

MHYEKTProQLAEIEELFASLMVYNRLTEELG 

rnFTTDRE 


813


2163 


A 


6630 


708 


1355 


AKMGAYKYIQELWRKKQSDVMRFLLRVKC 

WQYRQLSALHRAPRPTRPDKARRLGYKAKQ 

GY/VY1YIGFVFAV1YRIRVRRGGRKRPVPKG 

ATYGKPVHHGVNQLKFARSLQSVAEERAGR 

HCGALRVLNSYWVGEDSTYKFFEVILIDPFHK 

A IRRNPDTQ WrTKP VKKHREMRGLTS AGRKS 

RGLGKGHKFHHTIGGSRRAAWRRRNTLQLH 

RYR 


814 


2164 


A 


6635 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWh'RLR 

DSEDRSDSRAAQPAHDSGHGDDESPSTSSGT 

AGTSSWELPGFYFDPEKKRYFRLLPGHNNCN 

PLTKESIRQKiMESK^RLLQEEDRRKKJARM 

GFNASSMLRKSQLGFLNVTNYCHLAHELRLS 

CMERKKVQ IRSMDPS AL ASDRFNLIL ADTN S 

DRLFTVNDVTVGGSKYGIINLQSLKTPTLKVF 

MHEKLYrTmKV^NSVCWASLNHLDSHILLC 

LMGLAETPGCATLLPASLFVNSHPAGIDRPGN 

MLCSFRIPGAWSCAWSLNIQANNCFSTGLSR 

RVLLTNVVTuHRQbr 0 1 iNMJ v w\vvr/Y±^vLf\ 

PLLFNGCRSGEIFAIDLRCGNQGKGWKATRLF 

HDSAVTSVRILQDEQYLMASDMAGKIKLWD 

LRTTKCVRQYEGHVNEYAYLPLHVHEEEGIL 

VA VGQDC YTRI W SLHDARJLLRTTPSPYPASKA 

DIPS V AFS SRLGGSRG APGLLMAVGQDL YC Y 

SYS 


815 


2165 



A 


6643 


659 


3282 


NKNILEVPSARTTRIMGDHLDLLLGVVLNiAU 

pVFGIPSCSFTCRIAfYRECNLTQVPQVLNTTE 

RIXLSFNYIRTVTASSFPFLEQLQLLELGSQYT 

PLTIDKEAFRNLPNLR1LDLGSSKIYFLHPDAF 

QGLFHLFELRLYFCGLSDAVLKDGYFRNLKA 

LTRLDLSKNQIRSLYLHPSFGKLNSLKSIDFSS 

NQIFLVCEHELEPLQGKTLSFFSLAANSLYSR 

VSVDWGKCMNPFRNMVLEILDVSGNGWTV 

DITGNFSNAISKSQAFSLILAHHIMGAGFGFHN 

IKDPDQNTFAGLARSSVRHLDLSHGFVFSLNS 
816


2166


811


2167


6646


6649


3811


63


1073

RVFETLKDLKVLNLAYNTCJNKIADEAFYGLD 

NLQVLNLSYNLLGELYSSNFYGLPKVAY1DL 

QKNHIAIIQDQTFKFLEKLQTLDLRDNALTTIH 

FIPSIPD1FLSGNKLVTLPKINLTANLIHLSENR 

LENLDILYFLLRVPHLQILILNQNRFSSCSGDQ 

TPSENPSLEQLFLGENMLQLAWETELCWDVF 

EGLSHLQVLYLNHNYLNSLPPGVFSHLTALR 

GLSLNSNRLTVLSHNDLPANLEILDISRNQLL 

APNPDVFVSLSVLDITHNKFICECELSTFINWL 

N^TNVTIAGPPADIYCS/TPDSLSGVSLFSLSTE 

GCDEEEVLKSLKFSLFWCTVTLTLFLN1TILTV 

TKFRGFCFICYKTAQRLVFKDHPQGTEPDMY 

KYDAYLCFSSKDFTWVQNALLKHLDTQYSD 

QNRFNLCFEERDFVPGENRP\ANIQDAIWNSR 

KIVCLVSRHFLRDGWCLEAFSYAQGRCLSDL 

NSALIMVWGSLSQYQLMKHQSIRGFVQKQQ 

YLRWPEDLQDVGWFLHKLSQQILKKEKEKK 

KDNNIPLQTVATIS 

RDRAGVRPAGKQHAAAAFYDVGGDRPWDS 

GNTQLPPRNPVKANAMFGAGDEDDTDFLSPS 

GGARLASLFGLDQAAAGHGNEFFQYTAPKQP 

KKGQGTAATGNQ ATPKTAPATM STPTIL V AT 

AVHAYRYTNGQYVKQGKFGAAVLGNHTTR 

EYRILLYISQQQPVTVARIHVNFELMVRPNNY 

STFYDDQRQNWSIMFESEKAAVEFNKQVCIA 

KCNSTSSLDAVLSQDLIVADGPAVEVGDSLE 

VAYTGWLFQNHVLGQVFDSTANKDKLLRLK 

LGSGKVTKGWEDGMLGMKKGGKRLL1VPPA 

CAVGSEGVIGWTQATDSILVFEVEVRRVKIA 

KDSGSDGHSVSSRDSAAPSPIPGADNLSADPV 

VSPPTSIPFKSGEPALRTKSNSLSEQLAINTSPD 

AVKAKL1SRMAKMGQPMLPILPPQLDSNDSEJ 

EDVNTLQGGGQPWTPSVQPSLQPAHPALPQ 

MTSQAPQPSVTGLQAPSAALMQVSSLDSHSA 

VSGNAQSFQPYAGMQAYAYPQASAVTSQLQ 

PVRPLYPAPLSQPPHFQGSGDMASFLMTEAR 

QHNTEIKMAVSKVADKMDHLMTKVEELQKH 

SAGNSMLIPSMSVTMETSMIMSNIQRIIQENER 

LKQE1LEKSNRIEEQNDKISELIERNQRYVEQS 

NLMMEKRNNSLQTATENTQARVLHAEQEKA 

KVTEELAAATAQVSHLQLKMTAHQKKETEL 

QMQLTESLKETDLLRGQLTKVQAKLSELQET 

SEQAQSKFKSEKQNRKQLELKVTSLEEELTDL 

RVEKESLEKNLSERKKKSAQERSQAEEEIDE1 

RKSYQEELDKLRQLLKKTRVSTDQAAAEQLS 

LVQAELQTQWEAKCEHLLASAKDEHLQQYQ 

EVCAQRDAYQQKLVQLQEKSVCFAVCLALQA 

QITALTKQNEQHIKELEICNKSQMSGVEAAAS 

DPSEKVKKJMNQVFQSLRREFELEESYNGRTI 

LGTIMNTIKMVTLQLLNQQEQEKEESSSEEEE 

EKAEERPRRPSQEQ S AS ASSGQPQ APLNRERP 

ESPMVPSEQWEEAVPLPPQALTTSQDGHRR 

KGDSEAEALSEIKDGSLPPELSCIPSHRVLGPP 

TS1PPEPLGPVSMDSECEESLAASPMAAK\PDN 

P SGKWC VREVAPDGPLQESSTRLSLTSNDPEE 

GDPLALGPESPGEPQPPQLKKDDVTSSTGPHK 

ELSSTEAGSTVAGAALRPSHHSQRSSLSGDEE 

DELFKGATLKALRPKAQPEEEDEDEVSMKGR 

PPPTPLFGDDDDDDDIDWLG 

FFRS SSDNGSPIRQ YE/HSTPAHQGPVMGLEG 














KS/ARNSQLRIVLVGKTGAGKSATGNSILGRK 

VFHSGTAAKSITKKCEKRSSSWKETELVWD 

TPGIFDTEVPNAETSKEIIRCILLTSPGPHALLL 

WPLGRYTEEEHKATEKILKMFGERARSFMIL 

IFTRKDDLGDTNLHDYLREAPEDIQDLMDIFG 

DRYCALNNKATGAEQEAQRAQLLGLIQRW 

RENKEGCYTNRMYQRAEEEIQKQTQAMQEL 

HRVELEREKARIREEYEEKIRKLEDKVEQEKR 

KKQMEKKLAEQEAHYAVRQQRARTEVESKD 

GILELIMTALQIASFILLRLFAED


818 


2168 


A 


6660 


357 


1890 


APSGSWTRVVLTLDPCSLRSRSPRSLLDPGMP 

GI S ARGLSHEGRKQLAVNLTRVL ALYRSILD A 

YIIEFBTDNLWDTLPCSWQEALDGLKPPQLA 

T^LGMPGEGEVVRYRSVWPLTLLALKSTA 

CALAFTRMPGFQTTSEFLENPSQSSRLTAPFR 

KHVRPKKQHEIRRLGELVKKLSDFT/GLHPGC 

RRGLRPG\HLSRFMALGLGLMVKS1EGDQRL 

VERAQRLDQELLQALEKEEKRNPQVVQTSPR 

HSPHHVVRWVDPTALCEEJLLLPLENPCQGRA 

RLLLTGLHACG\DLSVALLRHFSCCPEWALA 

SVGCCYMKLSDPGGYPLSQWVAGLPGYELP 

YRLREGACHALEEYAERLQKAGPGLRTHCY 

RAALETVIRRARPELRRPGVQGIPRVHELKIEE 

YVQRGLQRVGLDPQLPLNLAALQAHLAQEN 

RWAFFSLALLLAPLVETLILLDRLLYLQEQA 

LSP\GFHAELLPIFSPELSPRNLVLVATKMPLG 

QALSVLETEDS 


819 


2169 


A 


6661 


65 


2686 


SGSGHCLAEAASMGPWGWKLRWTVALLLA 

AAGTAVGDRCERNEFQCQDGKCISYKWVCD 

GSAECQDGSDESQETCLSVTCKSGDFSCGGR 

VNRCEPQFWRCDGQVDCDNGSDEQGCPPKTC 

SQDEFRCHDGKCISRQFVCDSDRDCLDGSDE 

ASCPVLTCGPASFQCNSSTCIPQLWACDNDPD 

CEDGSDEWPQRCRGLYVFQGDSSPCSAFEFH 

CLSGECIHS S WRCDGGPDCKDKSDEENC AVA 

TCRPDEFQCSDGNCIHGSRQCDREYDCKDMS 

DEVGCVNVTLCEGPNKFKCHSGECITLDKVC 

NMARDCRDWSDEPrKECGTNECLDNNGGCS 

HVCNDLKIGYECLCPDGFQLVAQRRCEDIDE 

CQDPDTCSQLCVNLEGGYKCQCEEGFQLDPH 

TKACKAVGSIAYLFFTNRHE\TIKMTLDRSEY 

TSLIPNLRNVVALDTEVASNRIYWSDLSQRMI 

CSTQLDRAHGVSSYDTVISRDIQAPDGLAVD 

WmSNIYWTDSVLGTVSVADTKGVKRKTLFR 

ENG SKPRAIVVDP VHGFMYWTDWGTP AKIK 

KGGLNGVDIYSLVTENIQWPNGITLDLLSGRL 

YWVDSKLHSISSIDVNGGNRKTILEDEKRLAH 

PFSLAVFEDKVFWTDITNEAIFSANRLTGSDV 

NLLAENLLSPEDMVLFHNLTQPRGVNWCERT 

TLSNGGCQYLCLPAPQINPHSPKJFTCACPDGM 

LLAR\DMRSCLTEG\EAAVATQETSTVRLKVS 

STAVRTQHTTTRPVPDTSRLPGATPGLTTVE1 

VTMSHQALGDVAGVRGhAEKKPSSVRALSIVL 

urvAi T vrpT n nVFl T WKMWRI KNINSINFDNP 

VYQKTTEDEVHICHNQDGYSYPSRQMVSLED 
DVA 


820 


2170 


A 


6666 


17 


4146 


ERGISSQ1KGMKSGSGGGSPTSLWGLLFLSAA 
LSLWPTSGE1CGPGIDIRNDYQQLKRLENCTVI 
EGYLHTLLISKAEDYRSYRFPKLTVITEYLLLF 
RVAGLESLGDLFPNLTVIRG\VXLFYNYALVIF 














EMTNLKDIGLYNLRNITRGVAIRIEKNADLCYL 

STVDWSLILDAVSNNYIVGNKPPKECGDLCP 

GTMEEKPMCEKTTINNEYNYRCWTTNRCQK 

MCPSTCGKRACTENNECCHPECLGSCSAPDN 

DTACVACRHYYYAGVCVPACPPNTYRFEGW 

RCVDRDFCAN1LSAESSDSEGFVTHDGECMQE 

CPSGF1RNGSQSMYCIPCEGPCPKVCEEEKKT 

KTIDSVTSAQMLQGCT1FKGNLLINIRRGNNIA 

SELENFMGLIEWTGYVKIRHSHALVSLSFLK 

NLRLILGEEQLEGNYSFYVLDNQNLQQLWD 

WDHRNLTIKAGKMYFAFNPKLCVSEIYRMEE 

VTGTKGRQSKGDINTRMNGERASCESDVLHF 

TSTTTSKNRJIITWHRYRPPDYRDLISFTVYYK 

EAPFKNVTEYDGQDACGSNSWNMVDVDLPP 

NKDVEPGILLHGLKPWTQYAVYVKAVTLTM 

VENDHIRGAKSEILYIRTNASVPSIPLDVLSAS 

NSSSQLIVKWNPPSLPNGNLSYYTVRWQRQP 

QDGYLYRHNYCSKDKIPIRKYADGTIDIEEVT j 

ENPKTEVCGGEKGPCCACPKTEAEKQAEKEE 

AEYRKVFENFLHNS1FVPRPERKRRDVMQVA 

NTTMSSRSRNTTAADTYNITDPEELETEYPFF 

ESRVDNKERTV1SNLRPFTLYRIDIHSCNHEAE 

KLGCSASNFVFARTMPAEGADDrPGPVTWEP 

RPENSIFLKWPEPENPNGLILMYEIKYGSQVE 

DQRECVSRQEYRKYGGAKLNRLNPGNYTARI 

QATSLSGNGSWTDPVFFYVQAKRYENFIHLII 

ALPVAVLLIVGGLVTMLYVFHRKRNNSRLGN 

GVLYASVNPEYFSAADVYVPDEWEVAREKJT 

MSRELGQGSFGMVYEGVAKGVVKDEPETRV 

AIKTVNEAASMRERIEFLNEASVMKEFNCHH

WRLLGWSQGQPTLVIMELMTRGDLKSYLR

SLRPEMENNPVLAPPSLSKMIQMAGEIADGM 

A YLN ANKFVHRDL AARNCMV AEDFTVKIGD 

FGMTRDIYETDYYRKGGKGLLPVRWMSPESL 

KDGVFTTYSDVWSFGWLWEIATLAEQPYQ 

GLSNEQVLRFVXMEGGLLDKPDNCPDMLFEL 

MRMCWQYNTKMRPSFLEIISSIKEEMEPGFRE 

VSFYYSEENKLPEPEELDLEPENMESVPLDPS 

ASSSSLPLPDRHSGHKAENGPGPGVLVLRASF 

DERQPYAHMNGGRKNERALPLPQSSTC 


821 


2171 


A 


ooy i 


106 


825 


GRVLFRGCGVGHKGQVLMGTFILAQDWLSE 
SNHVFCVSSMLRLQKRLASSVLRCGKKKVW 
LDPNETNELANANSRQQIRKLIKDGLIIRKPVT 
VH S RARCRKNTL ARRXGRHMGIGKRKGT AN 
ARMPEKVTWMRRMRILRRLLRRYRES/KRYR 
E SKKIDRHM YH SL YLK VKGNV7KNKRILMEH 
IHKJ.KADKARKJaL^QAEARJ^KTKEA^ 
RREERLQAKKEEEQCTLSKEEETKK 


822


2172


A 


6715 


772 


21 


"DFRPGLLLPRKXKMFGFHKPKMYRSIEUaCI 
SGAKSSSS\RFTDSKRYEK\DFQ\SCFGLHETR\ 
SGDI\CNA\CVLL\LKRWKKLPAGSKKVNWNH 
WDARAGPSVLKTTLKPKXVKTL\SGNRIK\ST 
QISKLQKEFKR\HNSDAHSTTS\SASP\AQSPLF 
mrvriTTt? urrn^riTn vnFPGSNRNHPVFSFLDL\ 
T Y WKRQKJ CCGJu* YKGRFGE VLIDTHLFKPCC 
SN1G<A\AAEKPEEQGPEPLPISTQEWVTEVFM 


823


2173 


A 


6727 


3 


4063 


" P YL ATLQLD S S LLCPPKYQTPP AAAQGQ ATPG 
NAGPLAPNGSAAPPAGSAFNPTSNSSSTNPAA 

! SSSASGSSVPPVSSSASAPG1SQISTTSSSGFSGS 
VGGQNPSTGGISADRTQGNIGCGGDTDPGQS 
824


6732


2440


365

"SSQPSQDGQESNVPSVGSLADPDYLMTPl^MN 
TPVTLNSAAPASNSGAGVLPSPATPRFSVPTP 
RTPRTPRTPRGGGTASGQGSVKYDSTDQGSP 
AJSTPSTTRPLNSVEPATMQPEPEAHSLYVTLIL 
SDSVMNIFKDRNEDSCC1CACNMNIKGADVG 
LYtPDSSNEDQYRCTCGFSAlMNRKLGYNSGL 
FLEDELDIFGKNSDIGQAAERRLM\MCQSTFL 
PQVEGTKKPQEPP1SLLLLLQNQHTQPFASLN 
FLDY1SSNNRQTLPCVSWSYDRVQADNNDY 
WTECFNALEQGRQYVDNPTGGKVDEALVRS 
ATVHSWPHSNVLDISMLSSQDWRMLLSLQP 
FLQDAIQKKRTGRTWENIQHVQGPLTWQQFH 
KMAGRGTYGSEESPEPLPIPTLLVGYDKDFLT 
ISPFSLPFWERLLLDPYGGHRDVAYTVVCPEN 
EALLEGAKTFFRDLSAVYEMCRLGQHKPICK 
VUIDGLMRVGKTVAQKLTDELVSEWFNQPW 
SGEENDNHSRLKLYAQVCRHHLAPYLATLQL 
DSSLLIPPKYQTPPAAAQGQATPGNAGPLAPN 
GSAAPPAGSAFNPTSNSSSTNPAASSSASGSSV 
PPVSSSASAPGISQISTTSSSGFSGSVGGQNPST 
GGISADRTQGNIGCGGDTDPGQSSSQPSQDG 
QESVTERERIGIPTEPDSADSHAHPPAWrYM 
VDPFTYAAEEDSTSGNFWLLSLMRCYTEMLD 
NLPEHMRNSFILQIVPCQYMLQTMKDEQVFY 
IQYLKSMAFSVYCQCRRPLPTQIHIKSLTGFGP 
AASIEMTLKNPERPSP1QLYSPPFILAPIKDKQT 
ELGETFGEASQKYNVLFVGYCLSHDQRWLL 
ASCTDLHGELLETCVVNIALPNRSRRSKVSAR 
KIGLQKLWEWCIGIVQMTSLPWRWIGRLGR 
LGHGELKDWSILLGECSLQTISKKLKDVCRM 
CGISAADSPSILSACLVAMEPQGSFWMPDAV 
TMGSVFGRSTALNMQSSQLNTPQDASCTHIL 
VFPTSST1QVAPANYPNEDGFSPNNDDMFVDL 
PFPDDMDNDIGILMTGNLHSSPNS SP VPSPGSP 
SGIGVGSHFQHSRSQGERLLSREAPEELKQQP 
LALGYFVSTAKAENLPQWFWSSCPQAQNNQC 
PLFLKASLHHMSVAQTDELLPARNSQRVPHP 
LDSKTTSDVLRFVLEQYNALSWLTCNPATQD 

RTSC LPVHFWLTQLYNAIMNIL __ 

VEEGLGRRRTPPGGRRGPVTPARPGPDSVKK 

RLLPPSSAAAFSSHRHNLLCSRRRGGGGGGG 

GGGGGTIKRPGITGPTAATSPSGEPGNAASAP 

LSLLSPFPGQTTYQHPGVAEPSAYGGRDVAC 

ASL VFGRLQHRGGDRKRGLLGRS SGDAASD 

QPFRCRSGSTAGRLVKQMDFTEAYADTCSTV 

GLAAREGNVKVLRKLLKKGRSVDVADNRG 

WMPIHEAAYHNSVECLQMLINADSSENYIKM 

KTFEGFCALHLAASQGHWKJVQILLEAGADP 

NATTLEETTPLFLAVENGQIDVLRLLLQHGAN 

VNGSHSMCGWNSLHQASFQENAEIIKLLLRK 

GANKECQDDFGITPLFVAAQYG\KLESL\S1LIS 

SGVANVNCQALDKATPLFIAAQEGHTKCVELL 

LSSGADPDLYCNEDSWQLPIHAAAQMGHTKJ 

LDLLIPLTNRACDTGLNKVSPVYSAVFGGHE 

DCLEILLRNGYSPDAQACLVFCFSSPVCMAFQ 

KDCEFFGIVNILLKYGAQINELHLAYCLKYEK 

FSIFRYFLRKGCSLGPWNHIYEFVNHAIKAQA 

KYKEWLPHLLVAGFDPLILLCNSWIDSVSrDT 

LIFTLEFTNWKTLAPAVERMLSARASNAWIL 

QQHIATVPSLTHLCRLEIRSSLKSERLRSDSYTS 














QLPLPRSLHNYLLYEDVLRMYEVPELAAIQD 
G 


825 


2175 


A 


6735 


277 


1252 


RIMGLFDRGVQMLLTTVGAFAAFSLMT1AVG 

TDYWLYSRGVCKTKSVSENETSKKNEEVMT 

HSGLWRTCCLEGNFFCGLCKQIDHFPEDADYE 

ADTAEYFLRAVRASSIFPILSVILLFMGGLCIA 

ASEFYKTRHNIILSAGIFFVSAGLSNIIGnVYIS 

ANAGDPSKSDSKKNSYSYGWSFYFGALSF11A 

EMVGVLAVHMFIDRHKQLRATARAYTDYLQ 

ASAITRDPSYRYRYQRRSRSSSRSTEPSHSRDA 

SPVGIKGFNTLPSTEISMYTLSRDPLKAATTPT 

ATYNSDRDNSFLQVHNCIQKENKDSLHSNTA 

NRRTTPV 


826 


2176 


A 


6744 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNE 

TEDCPGMMLWRYPEPRGLTLVRTTPVPFNTT 

EDPDISTADLGDVLQDPCSLEYWDELQKVFV 

AFREFNLSESKVCELQLPDINLVNDQKKLVSS 

DLWRIVLNSSQNGADDQSSASESGSQSTCDPL 

VTPTALAACTRVDSCFTPWFVPSLCVSFQFAH 

LEFHLCHHLDQLGTAAPQYLQPFVSDRNMPS 

ELEYMIVSFREPHMYLRQWNNGSVCQEIQFL 

AQADCKLLECRNVTMQSVVKPFSIFGQMAVS 

SDVVEKLLDCTVrVDSVFVNLGQHWHSLNT 

AIQAWQQNKCPEVEELVFSHFVIOJDTQETL 

RFGQVDTDENBLLASLHSHQYSWRSHKSPQL 

LHICIEGWGNWRWSEPFSVDHAGTFIRTIQYR 

GRTASLIIKVQQLNGVQKQ[IICGRQI1CSYLSQ 

SIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLE 

SKAPEYSIV1QVPSSNSSIIYVWCTVLTLEFNS 

Q VQQRMIVFSPLFIMRSHLPDP IIIHLEKRSLGL 

SETQIIPGKGQEKPLQNIEPDLVHHLTFQAREE 

YDPSDCAVPISTSLIKQIATKVHPGGTVNQILD 

EFYGPEKSLQPIWPYNKKDSDRNEQLSQWDS 

PMRVKLS1WKPYVRTLLIELLPWALLINESKW 

DLWLFEGEKIVLQVPAGKIIIPPNFQEAFQIGIY 

WANTNTVHKSVAHCLVHNLTSPKWKDGGNG 

EWTLDEEAFVDTEIRLGAFPGHQKLCQFCIS 

SMVQQGIQ1IQIEDKTTIINNTPYQIFYKPQLSV 

CNPHSGKEYFRVPDSATFSICPGGEQPAMKSS 

SLPCWDLMPD1SQSVLDASLLQKQIMLGFSPA 

PGADSSQCWSLPAIVRPEFPRQSVAVPLGNFR 

ENGFCTRAIVLTYQEHLGVTYLTLSEDPSPRV 

IIHNRCPVKMLIKENIKD1PKFEVYCKKIPSECS 

IHHELYHQISSYPDCKTTCDLLPSLLLRVEPLDE 

VTTEWSDAIDINSQGTQVVFLTGFGYVYVDV 

VHQCGTVFITVAPEGKAGPLLrNTNRAPEKiV 

TF/KMFITQLSLAWDDLTHHKASAELLRLTL 

DNIFLCVAPGAGPLPGEEPVAALFELYCVEIC 

CGDLQLDNQLYNKSNFHFAVLVCQGEKAEPI 

QCSKMQSLLISNKELEEYKEKCFIKLCITLNEG 

KSrLCDINEFSFELKPARLYVEDTFVYYIKTLF 

DTYLPNSRLAGHSTHLSGGKQVLPMQVTQH 

adai VNPVKLRKLVIOPVNLLVSIHASLKLY1 

ASDHTPLSFSVFERGPIFTTARQLVHALAMHY 

AAGALFRAGWWGSLDILGSPASLVRSIGNG 

V ADFFRLP YEGLTRGPGAFVSG VSRGTTSFVK 

mSKGTLTSITNLATSLARNMDRLSLDEEHYN 

RQEEWRRQLPESLGEGLRQGLSRLGISLLGAI 

AGIVDQPMQNFQKTSEAQASAGHKAKGVISG 














"VGKGIMGVrTKPIGGAAELVSQTGYGILHGA 
GLSQLPKQRHQPSDWHADQAPNSHVKYVW 
KMLQSLGRPEVHMALDWLVRGSGQEHEGC 
LLLTSEVLFWSVSEDTQQQAFPVTEIDCAQD 
SKQNNLLTVQLKQPRVACDVEVDGVRERLSE 
QQYNRLVDYITKTSCHLAPSCSSMQ1PCPWA 
AEPPP STVKTYHYL VDPHFAQ VTLSKPTMVK 
NKALRKGFP 


827 


2177 


A 


6748 


2 


1662 


FVGAPRRGNPFGSPGNPGRHQGPCHRPRG IK 

ASGVSPTLWRPQAAATGLEMPSSGRAIXDSP 

LDSGSLTSLDSSVFCSEGEGEPLALGDCFTVN 

VGGSRFVLSQQALSCFPHTRLGKLAVWASY 

RRPGALAAVPSPLELCDDANPVDNEYFFDRS 

SQAFRYVLHYYRTGRLHVMEQLCALSFLQEI 

QYWGIDELSIDSCCRDRYFRRKELSETLDFKK 

DTEDQESQHESEQDFSQGPCPTVRQKLWNIL 

EKPGSSTAARIFG VISION GVSIEKMALMSAEL 

SWLDLQLLEDLEYVCIS^TTGEFVLRFLCVRD 

RCRFLRKVPNIIDLLAILPFY1TLLVESLSGVSQT 

TQEL\ENVGAHCPGCLRLLRAL\RMLKAWGR 

HSTGLRSLGMTITQCYEEVGLLLLFLSVG1SIF 

STVEYFAEQS1PDTTFTSVPCAWWWATTSMT 

TVGYGDIRPDTTTGKJVAFMCILSGILVLALPI 

AIINDRFSACYFTLKLKEAAVRQREALKKLTK 

NIATDSYISVNLRDVYARSIMEMLRLKGRER 

ASTRSSGGDDFWF 


828 


2178 


A 


6786 


5672 


1360 



GTHPASSGPVPLPPAAVSAATREELGEPVPFV 

TASSGFQSMHSSNPKVRSSPSGNTQSSPKSKQ 

EVMVRPPTVMSPSGNPQLDSKFSNQGKQGGS 

ASQSQPSPCDSKSGGHTPKALPGPGGSMGLK 

NG AGNG AKGKGKRERS TSADSFDQRDPGTPN 

DDSDIKECNSADHIKSQDSQHTPHSMTPSNAT 

APRS STPPHGQTTATEPTPAQKTPAKVVYVFS 

TCMANKAAEAVXKGQVETIVSFH1QNISNNK 

TERSTAPLNTQISALRNDPKPLPQQPPAPANQ 

DQNS SQNTRLQPTPPIP APAPKPAAPPRPLDRE 

SPGVENKLIPSVGSPASSTPLPPDGTGPNSTPN 

NRAVTPVSQGSNSSSADPKAPPPPPVSSGEPPT 

LGENPDGLSQEQLEHRERSLQTLRDIQRMLFP 

DEKEFTGAQSGGPQQNPGVLDGPQKKPEGPI 

QAMMAQSQSLGKGPGPRTDVGAPFGPQGHR 

DVPFSPDEMVPPSMNSQSGTIGPDHLDHMTP 

EQIAWLKLQQEFYEEKRRKPEQVWQQCSLQ 

D\fMVHQHGPRGWRGPPPPYQMTPSEGWAP 

GGTEPFSIXjINMPHSLPPRGMAPHPNMPGSQ 

MRLPGFAGMINSEMEGPNVPNPASRPGLSGV 

SWPDDVPKIPDGRNFPPGQOIFSGPGRGERFP 

NPQGLSEEMFQQQLAEKQLGLPPGMAMEGIR 

PSMEMNRWGSQRHMEPGNNPIFPRIPVEGP 

LSPSRGDFPKGIPPQMGPGRELEFGMVPSGM 

KGDVNLNVNMGSNSQMIPQKMREAGAGPEE 

MLKLRPG G SDMLP AQQKMVPLPFGEHPQQE 

YGMGPRPFLPMSQGPGSNSGLRNLREPIGPDQ 

RTNSRLSHMr rLr-LlNr SoiNr 1 oi^lN l Arr v v^jvvj 

LGRKPLDISVAGSQVHSPGINPLKSPTMHQVQ 

SPMLGSPSGNLKSPQTPSQLAGMLAGPAAAA 

SIKSPPVLGSAAASPVHLKSPSLPAPSPGWTSS 

PEPPLQSPGIPPNHKAPLTMASPAMLGNVESG 

GPPPPTASQPASVNIPGXSLPSSTPYTMPPEPTL 

SQNPLSINf^MSRVMSKFArVfPSNSNPGYNHDAI 














"KTVASSDDDSPPARSPNLPSKINNMPGMU1N 1 
QNPR1SGPNPWPMPTLSPMGMTQPLSHSNQ 
MPSPNAVGPNIPPHGVPMGPGLMSHNPIMGH 
GSQEPPNWPQGRMGFPQGFPPVQSPPQQVPFP 
HNGPSGGOGSFPGGMGFPGEGPLGRPSNLPQ 
SSADAALCKPGGPGGPDSFTVLGNSMPSVFT 
DPDLQEVIRPGATGIPEFDLSR1IPSEKPSQTLQ 
YFPRGEVPGRKQPQGPGPGFSHMQGMMGEQ 
APRMGLALPGMGGPGPVGTPDIPLGTAPSMP 
GHNPMRPPAFLQOGMMGPHHRMMSPAQST 
MPGQPTLMSNPAAAVGMIPGKDRGPAGLYT 
HPGPVGSPGMMMSMQGMMGPNNRTS 


829 


2179 


A 


6797 


433 


3 


"ASFFNFSICICKilLE VGPP VGHPAHDD VUUKH 
GPGGR/GSRSPRSLQCAPGGGRRSGCPAGSSP 
ASTCPPSPGGSGADRFGPSPPPPSREAAPTAG 
AAASSTSSGASCPPVPASSRWGVRSRTRSGSG 
GEREPRDRPSERPRLV 


830 


2180 


A 


6800 


3 


1911 


' LPERAFGPRTPRAPRRRRRRLLLSPPPRPFFHL 
DREPRAPGPWLCPSRAGTAQDPARIRERRGR 
VAGGAAGPAMELRARGWWLLCAAAALVAC 
ARGDPASKSRSCGEVRQIYGAKGFSSS\DVPQ 
AEISGEHLRJCPQGYTCCTSEMEENLANRSHA 
ELET ALRD S SR VLQ AML ATQLRSFDDHFQHL 
LNDSERTLQATFPGAFGELYTQNARAFRDLY 
SELRLYYRGANLHLEETLAEFWARLLERLFK 
QLHPQLLLPDDYLDCLGKQAEALRPF\GEAP\ 
RELRLRA1ARA\FVAAR\SFVQGLGVAS\DWR 
KVAQVPLG\PEC\SRAVTEAGSYC/ALHCVGVP 
GARPCPDYCRNVLKGCLANQADLDAEWRNL 
LDSMVL1TDKFWGTSGVESVIGSVHTWLAEA 
rMAT onNRDTLTAKVIQGCGNPKVNPQGPGP 
EEKKRRGKLAPRERPPSGTLEKLVSEAKAQL 
RDVQDFWISLPGTLCSEKMALSTASDDRCWN 
GMARGRYLPEVMGDGLANQINNPEVEVDIT 
KPDMTIRQQIMQLKIMTKRLRSAYNGNDVDF 
QDASDDGSGSGSGDGCLDDLCGRKVSRKSSS 
S RTPLTHALPGLS EQEGQKTS AASCPQPPTFL 
T .PI.LLFLALTVARPRWR 


831 


2181 


A 


6808 



2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEA 

EGRLREKLFSGYDSSVRPAREVGDRVRVSVG 

L1L AQLI SLNEKDEEMSTKVYLDLEWTDYRLS 

\\nDPAEHDGIX)SLRITAESVWLPDVVLLN>rNI) 

GNFDVALDISVWSSDGSVRWQPPGIYRSSCS 

IQVTYFPFDWQNCTMVFSSYSYDSSEVSLQT 

GLGPDGQGHQEIHIHEGTFIENGQWENIHKPS 

RLIQPPGDPRGGREGQRQEVIFYLIIRRKPLFY 

LVNV1APCILITLLAIFVFYLPPDAGEKMGLSIF 

ALLTLTVFLLLLADKVPETSLSVPUIKYLMFT 

VfVLVTFSVILSVWLNLHHRSPHTHQMPLWV 

RQIFIHKLPLYLRLKRPKPERDLMPEPPHCSSP 

GSGWGRGTDEYFIRKPPSDFLFPKPNRFQPEL 

SAPDLRRFEDGPNRAVALLPELREWSSISY1A 

RQLQEQEDHDALKEDWQFVAMWDRLFLW 

r _- rc , \\/TFT HATYHI PPPDPFP 
Trill* 1 S VU 1 LW \rL,\Jf\ l iaLrrrurfx ^ 


832 


2182 


A 


6824 


71 


1079 


ETMAKNPPENCEDCHILNAEAFKSKKlC'KiiLl^ 
ICGLVFGIT.ALTLIVLFWGSKHFWPEVPKKAY 
DMEHTFYSNGEKKKJYMEEDPVTRTE1FRSGN 
GTDETLEVHDFKNGYTGIYFVGLQKCFDCTQI 
KV1PEFSEPEEEIDENEEJTTTFFEQSVIWVPAE 
KPIENRDFLKNSKILEICDNVTMYW^ 














"GTFAKQLHHNFAFIILVSELQDFEEEGEDLHFF 
ANEKKGIEQNEQWVVPQVKVEKTRHARQAS 
EEELPINDYTENGIEFDPMLDERGYCCIYCRR 
GNRYCRRVCEPLLGYYPYPYCYQGGRVICRV 
IMPCNWWVARMLGRV 


833 


2183 


A 


6846 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETAR1GP 

GVMESKEERALNNLIVENVNQENDEKDEKE 

QVANKGEPLALPLNVSEYCVPRGNRRRFRVR 

QPILQYRWDIMHRLGEPQARMREENMERJGE 

EVRQLMEKLREKQLSHSLRAVSTDPPHHDHH 

DEFOLMP 


834 


2184 


A 


6851 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSR 

GSREESPAPSRAPASASLWRRLWYEAKMAA 

HAAAAAQ AAAAQ AAHAEAAD S WYLALLGF 

AEHFRTSSPPKIRLCVHCLQAVFPFKPPQRIEA 

RTHLQLGSVLYHHTKNSEQARSHLEKAWLIS 

QQIPQFEDVKFEAASLLSELYCQENSVDAAKP 

LLRKAIQISQQTPYWHCRLLFQLAQLHTLEKD 

LVSACDLLGVGAEYARWGSEYTRALFLLSK 

GMLLLMERKLQEVHPLLTLCGQIVENWQGN 

P1QKESLRVFFLVLQVTHYLDAGQVKSVKPC 

LKQLQQCIQTI STLHDDEILPSNP ADLFH WLP 

KEHMCVLVTLVTVMHSMQAGYLEKAQKYT 

DKALMQLEKLKMLDCSPILSSFQVJDLLEHIIM 

CRLVTGHKATALQEISQVCQLCQQSPRLFSN 

HAAQLHTLLGLYCVSVNCMDNAEAQFTTAL 

RLTNHQELWAFIVTNLASVYIREGNRHQEVVX 

LYSLLERINPDHSFPVSSHCLRAAAFYVRGLF 

SFFQGRYNEAKRFLRETLKMSNAEDLNRLTA 

CSLVLLGHIFYVLGNHRESNNMVVPAMQLAS 

KIPDMS VQL WSS ALLRDLNKACGNAMD AHE 

AAQMHQNFSQQLLQDHrEACSLPEHNLITWT 

DGPPPVQFQAQNGPNTSLASLL 


835 


2185 


A 


6855 


334 


1268 


PTRRPILPLTSPKA1SVPSPLQGKQHTLVKSCL 

SVSGIGGFLVSLSSRMKLQTLAVSVTALKFWS 

AYVPCQTQDRDALRLTLEQ1DLIRRMCASYSE 

LEL VT S AKALNDTQKLACLIGVEGGHSLDNS 

LSILRTFYMLGVRYLTLTHTCNTPWAESSAK 

GVHSFYNNISGLTDFGEKWAEMNRLGMMV 

DLSHVSDAVARRALEVSQAPVLFSHSAARGV 

CNSARNVPDDILQLLEEERWAFVMVSLFHGE 

LIQWQPIRPMCSTVADHFDHIKAWGSKFIGI 

GGDYDGAGKYRKKTTOCAPWRTSSRMSS 


836 


2186 


A 


6862 


315 


11 


" PPRSRPSCWRKKVGPGRPWWWGGTGPPGQG 
RPEIRLLPLPMTGACGAVAASRTGSSGPG/SSL 
PNGHGGKGSGL ANGLAGNP\GHLGLGS SFGT 
GPGSGRPPP 


837 


2187 


A 


6863 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRJHDV'IT 

APFPGLVQRRSRLL1VSQVRYFLKNKVSPDLC 

NEDGLTALHQCCIDNFEE1VKLLLSHGANVN 

AKDNELWTPLHAAATCGHINLVKILVQYGA 

DLLAVNSDGNMPYDLCEDEPTLDVTETCMAY 

rv^ rrn F T^TTsTF MR V A PEOOMIADIHCMIAAGQ 

DLDWIDAQGATLLH1AGANGYLRAAELLLDH 

GVRVDVKDWDGWEPLHAAAFWGQMQMAE 

LLVSHGAN\LNARTSMDEMPIDLCEEEEFKVL 

LLELK\HKHDVTMKSQLRHKSSLSRRTSHRQA 

S/SVGKVVRRTQPVGPGPNL\YRKEYE/GEEAJ 

LWQRSA\AEDQRTSTYNGDIRETVRTDQENKD 
"PNPRLEKXP VLLSEFPTKJPRGELDMP VEN ULK 
APVSAYQYALANGDVWKVHEVPDYSMAYG 
NPGVADATPPWSSYKEQSPQTLLELKRQRAA 
AKLLSHPFLSTHLGSSMARTGESSSEGKAPL1 
GGRTSP Y S SNGTS VYYTVTSGDPPIXKFKAPI 
EEMEEKVHGCCRIS 


838 


2188 


A 


6865 


6291 


739 


AGPLEPRVQGAMALQLWALTLLGLLGACiAS 

LRPRKLDFFRSEKELNHLAVDEASGWYLGA 

VNALYQLDAKLQLEQQVATGPVLDNKKCTP 

PIEASQCHEAEMTDNVNQLLLVDPPRKRLVE 

CGQLLKG1\CALRALSNISLRLFYEDGSGEKSF 

VASNDEGVATVGLVSSTGPGGDRVLFVGKG 

NGPHDNGIIVSTRLLDRTD S REAFE A YTDHAT 

YKAGYLSTNTQQFVAAFEDGPYVFFVFNQQD 

KHPARNRTLLARMCREDPNYYSYLEMDLQC 

RDPDIHAAAFGTCLAASVAAPGSGRVLYAVF 

SRDSRSSGGPGAGLCLFPLDEVHAKMEANRN 

ACYTGTREARDIFYKPFHGDIQCGGHAPGSSK 

SFPCGSEHLPYPLGSRDGLRGTAVLQRGGLN 

LTAVTVAAENNHTVAFLGTSDGRILKVYLTP 

DGTSSEYDSDLVEINKRVKRDLVLSGDLGSLY 

AMTQDKVFRLPVQECLSYPTCTQCRDSQDPY 

CGWCVVEGRCTRKAECPRAEEASHWLWSRS 

KSCVAVTSAQPQNMSRRAQGEVQLTVSPLPA 

LSEEDELLCLFGESPPHPARVEGEAVICNSPSS 

IPVTPPGQDHVAVTIQLLLRRGNIFLTSYQYPF 

YDCRQAMSLEENLPCISCVSKRWTCQWDLR 

YHECREASPNPEDGIVRAHMEDSCPQFLGPSP 

LVIPMNHETDVNFQGKNLDTVKGSSLHVGSD 

LLICFMEPVTMQESGTFAFRTPKLSHDANETL 

PLHLYVKSYGKNIDSKLHVTLYDCSFGRSDC 

SLCRAANPDYRCAWCGGQSRCVYEALCNTT 

SECPPPVITRJQPETGPLGGGIRrnLGSNLGVQ 

AGDIQRISVAGRNCSFQPERYSVSTRIVCVIEA 

AETPFTGGVEVDVFGKLGRSPPNVQFTFQQP 

KPLSVEPQQGPQAGGTTLTtHGTHLDTGSQED 

VRVTLNGVPCKVTKFGAQLQCVTGPQATRG 

QMLLEVSYGGSPVPNPGIFFTYRENPVLRAFE 

PLRSFASGGRSINVTGQGFSLIQRFAMW1AEP 

LQSWQPPREAESLQPMTWGTDYVFHNDTK J 

WFLSPAVPEEPEAYNLTVLIEMDGHRALLRT 

EAGAFEYVPDPTFENFTGGVKKQVNKLIRAR 

GTNLNKAMTLQEAEAFVGAERCTMKTLTET 

DLYCEPPEVQPPPKRRQKRDTTHNLPEFIVKF 

GSREWVLGRVEYDTRVSDVPLSLILPLVIVPM 

VWIAVSVYCYWRKSQQAEREYEKIKSQLEG 

LEESVRDRCKJK^FTDLMIEMEDQTNDVHEAG 

1PVLDYKTYTDRVFFLPSKDGDKDVM1TGKL 

DIPEPRRPWEQALYQFSNLLNSKSFLINFIHT 

LVENQPEFSARAKVYFASLLTVALHGKLEYYT 

DIMHTLFLELLEQYVVAKNPKLMLRRSETW 

FRMLSNWMSICLYQYLKDSAGEPLYKLFKAl 

KHQVEKGPVDAVQKKAKYTLNDTGLLGDD 

VEYAPLTVSVrVQDEGVDAIPVKVLNCDTlSQ 

VKEKIIDQVYRGQPCSCWPRPDSWLEWRPG 

STAQILSDLDLTSQREGRN^TCRVNTLMHYNVR 

DGATLDLSKVGVSQQPEDSQQDLPGERHALL 

EEENRVWHLVRPTDEVDEGKSKRGSVKEKE 

RTKJUTEIYLTRLLSVKGTLQQFVDNFFQSVL 

APGHA VTP A VK YFFDFLDEQ AEKHNI QDEDTI 














HI WKTN SLPLRF WVN 1LKNPHFEFD VH VHE W 
DASLSVTAQTFMDACTRTEHKLSRDSPSNKLL 
Y AKEI STYKXMVED YYKGIRQM VQ VSDQDM 
NTHLAEISRAHTDSLNTLVALHQLYQYTQKY 
YDE1INALEEDPAAQKMQLAFRLQQIAAALE 

TJkrvmT. 


839 


2189 


A 


6872 


1 


1485 


RARRLALQCHVCVCALTPGEQSGRRLPGQT 

WLMFSCFCFSLQDNSFSSTTVTECDEDPVSLH 

EDQTDC S SLRDENNKENYPD AG AL VEEHAPP 

SWEPQQQNVEATVLVDSVLRPSMGNFKSRKP 

KSIFKAESGRSHGESQETEHWSSQSECQVRA 

GTP AHESPQNN AFKCQET\VRL\QPR1DQRT AT 

SPKD AFETRNQDLNEEEAAQVHGVKDPAPAS 

TOSVLAXDGTDSADPSPVHKDGQNEADSAPE 

DLHSVGTSRLLLATOTDGDNPTAVRHGCS1VF 

SGOSQRFNLDPESAPSPPSTQQFMMPRSSSRC 

SCGDGKEPQTITQLTKMQSLKRK1RKFEEKFE 

OEKKYRPSHGDKTSKPEVIXWMNDLAKGRK 

OLKELKLKLSEEQGSAPKGPPRNLLCEQPTVP 

RENGKPEAAGPEPSSSGEETPDAALTCLKERR 

EQLPPQEDSKVTKQDKNLIKPLYDRYRIIKQIL 

STPS L1PT[VSQDTCMLLLCTDV 


840 


2190 


A 


6873 


2 


2054 


^TFRFYFSFIRLFAMSLAOLTKTNIDEHFFGVAL 
ENNRRSAACKRSPGTGDFSRNSNASNKSVDY 
SRSQCSCGSLSSQYDYSEDFLCDCSEKAINRN 
YLKQPVVKJEKEKKKYNVSKJSQSKGQKEISV 
EKKKTWNASLFNSQIHM1AQRRDAMAHRILS 
ARLHKIKGLKNELADMHHKLEAILTENQFLK 
OLQLRHLKAIGKYENSQNNLPQIMAKHQNEV 
KNLRQLLRKSQEKERTL SRKLRETD SQLLKT 
KDILO ALQKLSEDKNLAEREELTHKLSIITTK 
MDANDKKIQSLEKQLRLNCRAFSRQLAIETR 
KTLAAQTATKTLQVEVKHLQQKLKEKDREL 
EIKKIYSHRILKNLHDTEDYPKVSSTKSVQAD 
RKJLPFTSMRHQGTQKSDVPPLTnXGKKATG 
NIDHKEKSTEINHEIPHCVNKLPKQEDSKRKY 
EDLSGEEKHLEVQILLENTGRQKDKKEDQEK 
KNTP VKEEQELPPKUE VIHPERESNQEDVL V K 
EKFKRSMQRNGVDDTJUGKGTAPYTKGPLRQ 
RRHYSFTEATENLHHGLPASGGPANAGNMR 
YSHSTGKHLSNREEMELEHS\DSGYEPSFGKS 
SRIKVKDTTFRDKKSSLMEELFGSGYVLKTD 
Q S SPG V AKG SEEPLQ SKESHPLPPSQ A STSHA 
FnnJsKVTVVNSIKPSSPTEGKRKHI 


841 


2191 


A 


6874 


3 


" 2867 


" -sSRTRENffiEKEILRRQIRLLQGLlDU ¥ KTLHG 
NAPAPGTPAASGWQPPTYHS GRAFS ARYPRP 
SRRGYSSHHGPSWRKKYSLVNRPPGPSDPPA 
DHAVRPLHGARGGQPPVPQQHVLERQVQLS 
OGQNWIKVKPPSKSGSASASGAQRGSLEEFE 
DTPWSDQRPREGEGEPPRGQLQPSRPTRARG 
TCSVEDPLLVCQKEPGKPRMVKSVGSVGDSP 
REPRRTVSESVIAVKASFPSSALPPRTGVALG 
RKLGSHSVASCAPQLLGDRRVDAGHTDQPVP 
SGSVGGPARPASGPRQAREASLVVTCRTKKF 
RXNNYKWVAASSKSPRVARRALSPRVAAEN 
VCKASAGMANKVEKPQLIADPEPKPRKPATS 
SKPGSAPSKYKWKASSPSASSSSSFRWQSEAG 
SKDHASQLSPVLSRSPSGDXRPALAHSGLKPLS 
GETPLSAYKVKTRTKIIRRRGSTSLPGDKKSG 
TSPAATAKSHLSLRRRQALRGKSSPVLKKTPN 














KGLVQVTKHRLCRLPPSRAHLPTKEAbbLHA 

VRTAPTSKVLKTRYRIVKKTPASPLSAPPFPLS 

LPSWRARRLSLSRSLVLNRLRPVASGOGKAQ 

PGSPWWRSKGYRCIGOVLYKVSANKLSKTSG 

OPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 

RSLAIIRQARQRREKRKEYCMYYNRFGRCNR 

GERCPYIHDPEKVAVCTRFVRGTCKKTDGTC 

PFSHHVSKEKMPVCSYFLKGICSNSNCPYSHV 

YYSRKAEVCSDFLKGYCPLGAKCKKKHTLLC 

PDFARRGACPRGAQCQLLHRTQKRHSRRAAT 

SPAPGPSDATARSRVSASHGPRKPSASQRPTR 

QTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 

SSSSSSSSPPASLDHEXAPSLQEAALAAACSNR 

LCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDS 

GKP1 .HTKPRL 


842 


2192 


A 


6898 


506 


2071 


WPDLVHTWSSEEAMGSCCSCPDKD TVFUNH 

RNKFKVINVDDDGNELGSGIMELTDTEL1LYT 

RKRDSVKWHYLCLRRYGYDSNLFSFESGRRC 

OTGQGIFAFKCARAEELFNMLQEIMQNNSIN 

WEEPVVERNNHQTELEVPRTPRTPTTPGFAA 

ONLPNGYPRYPSFGDASSHPSSRHPSVGSARL 

PSVGEESTHPLLVAEEQVHTYVNTTGVQEER 

KNRTSVHVPLEARVSNAESSTPKEEPSSIEDR 

DPOILLEPEGVKFVLGPTPVQKQLMEKEKLE 

m r p no V<sCtSG ANNTE WDTGYDSDERRDAP 

SVNKLVYENINGLSIPSASGVRRGRLTSTSTSD 

TONINNSAQRRTALLNYENLPSLPPVWEARK 

LSRDEDDNLGPKTPSLNGYHNNLDPMHNYV 

NTENVTVPASAHKIEYSRRRDCTPTVFNFDIR 

RPSLEHRQLNYIQVDLEGGSDSDNPQTPKTPT 

TPLPQTPTRRTELYAVIDIERTAAMSNLQKAL 

pp nnr;TSR\KTRHNSTuOLPL 


843 


2193 


A 


6919 


2 


663 


AGRPGTTHASGKMAYQSLRLEYLQ1PP V SKA 

YTTACVLTTAAVQLELITPFQLYFNPELIFKHF 

OIWRLITTMFLFFGPVGFNFOTMMIFLYRYCRM 

LEEGSFRGRTADFVFMFLFGGFLMTLFGLFVS 

1 7VFI GPGLYNN/GSSMCGAFAEPLCPHELLRP 

SOLPGPLSALGAHGIFLWGELNHCGPFGYCS 

WTHTFFLGRCISOSTWWNKNSENTIYFESYF 


844 


2194 


A 


6928 


902 


366 


" HRLCMPIQGACGERME/FSLLLPGLKCJNLiViL 
AHCNLRLPGSSNSPASASQVAGITGVCHHAR 
LIFVFSVETGFLHAGQAGLELLTSGDPPASAS 
QSAGfTGKSQHTRPGYEFnPYSAAQEDALKA 

1 M 


845 


2195 


A 


6939 


f660 


317 


' LYPENLGESLFPILLLPPPWPDGGRPCCVbMi 
TRAKKLRRIWRJLEEKESVAGAVQTLLLRSQE 
GGV\TSAAASTLSEPPRRTQESRTRTRALGLPT 
LPMEKLAASTEPQGPRPVLGRESVQVPDDQD 
FRSFRSECEAEVGWNLTYSRAGVSVWVQAV 
EMDRTLHKIKCRMECCDVPAETLYDVLHD1E 
YRjCKWDSNVffiTHDIARLTVNADVGYYSWR 
CPK^LKNTlDVirLRSV^LPMGADYinvlNYSVK 

HPKYPPRKDLVKAV aiy i u i i/iyo lurrwuw ▼ 

YLAQVDPKGSLPKWVVNKSSQFLAPKAMKK 

MYKACLKYPEWKQKHL\PHFKPWL\HPEQSP 

LPSLALS\ELSVQHADS\LEN1DESAV\AESREE 

RVMGGAGGEG\SDDDTSLYAEAPHRFRETETG 

PGAGRALGAAAAPALSPLHPPGTWWHRARP 

RRVLQPGWTEPQ 


846 


2196 


A 


6944 


42 


2672 


RRKMAGCRGSLCCCCRWCCCCGERETRTPE 

ELTILGETQEEEDEILPRKDYESLDYDRCINDP 

YLEVLETMDNKKGRRYEAVKWMVVFA1GV 

CTGLVGLFVDFFVRLFTQLKFGWQTSVEECS 

QKGCLALSLLELLGFNLTFVFLESLLGLIEPVE 

AGSGITEGKCYLYARQVPGLVRLPTLLWKAL 

GVLLTVAAMLLI\GLGSPMIHSGSVVGAGLPQ 

FQSISLRK1QFNFPYFRSDRYGKVDKRDFVSAG 

AAAGVAAAFGAPIGGTLFSLEEGSSFWNQGL 

TWKVLFCSMSATFTLNFFRSGIQFGSWGSFQL 

PGLLNFGEFKCSDSDKKCHLWTAMDLGFFV 

VMGVIGGLLGATFNCLNKRLAKYRMRNVHP 

KPKXWVLESLLVSLVTTVVVFVASMVLGEC 

RQMSSSSQIGNDSFQLQVTEDVNSSIKTFFCP 

NDTYNDMATLFFNPQESAILQLFHQDGTFSPV 

TLALFFVLYFLLACWTYGISVPSGLFVPSLLC 

GAAFGRLVANVLKSYIGLGHIYSGTFALIGAA 

AFLGGVVRMTISLTVILIESTANEITYGLPIMVT 

LMVGKWTGDFFNKGIXYD1HVGLRGVPLLEW 

ETEVEMDKLRASDIMEPNLTYVYPHTRIQSLV 

SILRTTVHHAFPVVTENRGNEKEFMXGNQLIS 

NKIKFKKSSILTRAGEQRKRSQSMKSYPSSEL 

RNMCDEHIASEEPAEKEDLLQQMLERRYTPY 

PNLYPDQSPSEDWTMEERFRPLTFHGLILRSQ 

LVTLLVRGVCYSESQSSASQPRLSYAEMAED 

YPRYPDIHDLDLTLLNPRMIVDVTPYMNPSPF 

TVSPNTHVSQVFNLFRTMGLRHLPVVNAVGE 

IVGIITRHNLTYEFLQARLRQHYQTI _ 


847 


2197 


A 


6951 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKJiR 

LSSIEKIKQLREQVNDLFSRKFGEAIGVDFPVK 

VPYRKITFNPGCVVIDGMPPGVVFKAPGYLEI 

S SMRRILEAAEFIKFT VIRPLPGLELSNGE YST 

VGKRKIDQEGRVFQEKWERAYFFVEVQNIST 

CLICKRSMSVSKEYNLRRHYQTNHSKHYDQY 

MERMRDEKLHELKKGLRKYLLGLSDTECPE 

QKQVFANPSPTQKSPVQPVEDLAGNLWEKLR 

EKIRSFVAYSIAIDEITDINNTTQLAIFIRGVDE 

NFDVSEELLDTVPMTGTKSGNEIFSRVEKSLK 

NFCINWSKLVSVASTGTPPMVDANKGLVTKL 

KSRVATFCKGAELKSICCDHPESLCAQ\KLKM 

DHVMDVVVKSVNWICSRGLNHSEFTTLLYEL 

DSQYGSLLYYTEIKWLSRGLVLKRFFESLEEI 

DSFMSSRGKPLJXJLSSIDWIRDLAFLVDMTM 

HLNALNISLQGHSQIVTQMYDLIRAFLAKLCL 

W^ETHLTRNNLAHFPTLKLVSRNESDGLNYIP 

KIAELKTEFQKRLSDFKLYESELTLFSSPFSTK1 

DSVHEELQMEVIDLQCNTVLKTKYDKVGIPE 

FYKYLWGSYPKYKHHCAKILSMFGSTYICEQ 

LFSIMKLSKTKYCSQLKDSQWDSVLH1AT 


848 


2198 


A 


6985 


3 


289 


" SVQYLPGRPTRTHASTDAPLNfLKFTPLPSKTK 
ASAPVQCLLLMAATFSPQGLAKPHSGTIPmC 
CFNAINTKIPIQRLESYTRITNIQCPKEAVM 


849 


2199 


A 


6999 


963 


5 


" " LDFLCHRDMGDNITSim'LLLGFPVGPKJQM 
LLFGLFSLFYVr 1 LIajNU 1 U^Ui^ioLiJoivA-*rLnj- 
MYFFLSHLVAVVDIAYACOT^RMLVNLLHP 
AKPISFAGRMMQTFLFSTFAVTECLLLWMS 
YDLYVWCHPLRYLAIMTWRVCITLAVTSWT 
TGVLLSLIHLVLLLPLPFCRPQKJYHFFCEILA 
VLKLACADTHINENMVLAGAISGLVGPLSTTV 
VSYMCILCAILQIQSREVQRKAFCTCFSHLCVI 














GLFYGTAHMYVGPRYGNPKEQKKYLLLFHS 
LFNPMLNPLICSLRNSEVKNTLKRVLGVERAL 


850 


2200 


A 


7001 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAI 

DPLRVAPLPLYAAIFLVGVPGNAMVAWVAG 

KVARRRVGATWLLHLAVADLLCCLSLPILAV 

PIARGGHWPYGAVGCRALPSIILLTMYASVLL 

LAAJLSADLCFLALGPAWXCLRES/GACGVQVA 

CGAAWTI^LLTVPSAIYRRLHQEHFPARLQ 

CWDYGGSSSTENAVTAIRFLFGFLGPLVAVA 

SCHSALLCWAARRCRPLGTAIWGFFVCWAP 

YHLLGLVLTVAAPNSAIXARALRAEPLIVGL 

ALAHSCLNPMLFLYFGRAQLRRSLPAACHW 

ALRESQGQDESVDSKKSTSHDLVSEMEV 


851 


2201 


A 


7011 


1 


2310 


AAASPLRMSRKGPRAEVCADCSAPDPGWASI 

SRGVLVCDECCSVHRSLGRHISIVKHLRHSA 

WPPTLLQMVHTLASNGANSIWEHSLLDPAQV 

QSGPALKQTPKDKVXHPIKSEFIRAKYQMLAF 

VHKLPCRDDDGVTAKDLSKQLHSSVRTGNLE 

TCLRLLSLGAQANFFHPEKGTTPLHVAAKAG 

QTLQ AELL WYG ADPG SPD VNGRTPIDY ARQ 

AGHHELAERLVECQYELTDRLAFYLCGRKPD 

HKNGHYIII^MADSLDLSELAKAAKXKJLQAL 

SNRLFEELAMDVYDEVDRRENDAVWLATQN 

HSTLVTERSAVPFLPVNPEYSATRNQGRQKL 

ARFNAREFATLIIDILSEAKRRQQGKSLSSPTD 

NLELSLRSQSDLDDQHDYDSVASDEDTDQEP 

LRSTGATRSNRARSMDSSDLSDGAVTLQEYL 

ELKKALATSEAKVQQLMKVNSSLSDELRRLQ 

REIHKLQAENLQLRQPPGPVPTPPLPSERAEH 

TPMAPGGSTHRRDRQAFSMYEPGSALKPFGG 

PPGDELTTRLQPFHSTELEDDAIYSVHVPAGL 

YRIRKGVSASAVPFTPSSPLLSCSQEGSRHTSK 

LSRHGSGADSDYENTQSGDPLLGLEGKKFLE 

LGKEEDFHPELESLDGDLDPGLPSTEDVILKT 

EQVTKNIQELLRAAQEFKHDSFVPCSEKIHLA 

VTEMASLFPKRPALEPVRSSLRLLNASAYRLQ 

SECRKTVPPEPGAPVDFQLLTQQVIQCAYD1A 

KAAKQLVTT1TREKXQ 


852 


2202 


A 


7016 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLL 

TVKGLLKPSFSPRNYKALSEVQGWKQRMAA 

KELARQNMDLGFKLLKKLAFYNPGRNIFLSP 

LS I STAFSMLCLG AQDSTLDEIKQGFNFRKMP 

EKDLHEGFHYIIHELTQKTQDLKLSIGNTLFID 

QRLQPQRKFLEDAKNFYSAETILTNFQNLEM 

AQKQINDFI/ESKTHGKINNLIENIDPGTVMLL 

ANYIFFRARWKHEFDPNVTKEEDFFLEKNSS 

VKVPMMFRSGIYQVGYDDKLSCTILEIPYQK: 

NITAIFILPDEGKLFCHLEKGLQVDTFSRWKTL 

LSRRWDVSVPRLHMTGTFDLKKTLSYIGVS 

KJFEEHGDLTKJAPHRSLKVGEAVNKAELKM 

DERGTEGAAGTGAQTLPMETPLVVKIDKPYL 
LLIYSEKIPSVLFLGKIVNPIGK 


853 


2203 


A 


7017 


1 


3293 


MTHACNPSTLGGQGRR1TRSHGRRRSSRGPV 

ARHVAAGAGHENKHGGSRRFPAGVAPRRAM 

ANVSKJCVS WSGRDRDDEEAAPLLRRTARPG 

GGTPLLNGAGPGAARQSPRSALFRVGHMSSV 

ELDDELLEFADMDPPHPFPKEIPHNEKLLSLKY 

ESLDYI)NSENQLFLEEERRINHTAFRTVEIKR 

M^nCALIGILTGLVACFIDIWENLAGLKYRVI 

KGSILPNIDKFTEKGGLSFSLLLWATLNAAFV 














LVGSVlVAFIEPVAAGSGIPQIKCFLNGVKiPH 

WRLKTLVIKVSGVILSWGGLAVGKEGPMI 

HSGSVIAAGISQGRSTSLKRDFKJFEYFRRDTE 

KRDFVSAGAAAGVSAAFGAPVGGVLFSLEEG 

ASFNWQFLTWRIFFASMlSTTTLNFVLSiYHG 

NM WDLS SPGLINFGRTOSEKMA YTIHEIPVFI 

AMGWGGVLGAVFNALNYWLTMFRIRYIHR 

PCLQVIEAVLVAAVTATVAFVLIYSSRDCQPL 

QGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 

SWSLFHDPPGSYNPLTLGLFTLVYFFLACWT 

YGLTVSAGVFIPSLLIGAAWGRLFGISLSYLTG 

AAIWADPGKYALMGAAAQLGGIVRMTLSLT 

V1MMEATSNVTYGFPIMLVLMTAKIV GDVFIE 

GLYDMHIQLQSVPFLHWEAPVTSHSLTAREV 

MSTPVTCLRRREKVGVIVDVLSDTASNHNGF 

PWEHADDTQPARLQGLELRSQLIVLLKHKVF 

VERSNLGLVQRRLRLKDFRDAYPRFPPIQSIH 

VSQDERECTMDLSEFMNPSPYTVPQEASLPR 

VFKLFRALGLRHLVWDNRNQWGLVTR1CD 

LARYRLGKRGLEELSLAQTGPKAQATAEGRV 

AG AAQQPCQLRAVTLEDLGLLLAGGLAS PEP 

LSLEELSERYESSHPTSTASVPEQDTAKHWNQ 

LEQWWELQAEVACLREHKQRCERATRSLL 

RE1XQVRARVQLQGSELRQLQQEARPAAQAP 

EKEAPEFSGLQNQMQALDKRLVEVREALTRL 

RRRQVQQEAERRGAEQEAGLRLAKLTDLLQ 

QEEQGREVACGALQKNQEDSSRRVDLEVAR 

M 


854 


2204 


A 


7037 


139 


2604 


AGTWEPRPYDQAKETGAPGSQPPVPPMELRP 

WLLWVVAATGTLVLLAADAQGQKVFTNTW 

AVRIPGGPAVANSVARKHGFLNLGQIFGDYY 

HFWHRGVTKRSLSPHRPRHSRLQREPQVQWL 

EQQVAKRRTKRDVYQEPTDPKFPQQWYL\SG 

VTQ\RDLMVKAAWAQGYTGHGIVVSILDDGI 

EKNHPDLAGNYDPGASFDVNDQDPDPQPRY 

TQMKDNRHGTRCAGEVAAVANNGVCGVGV 

AYNARIGGVRMLDGEVTDAVEARSLGLNPN 

HIHIYSASWGPEDDGKTVDGPARLAEEAFFR 

GVSQGRGGLGSIFVWASGNGGREHDSCNCD 

GYTNSIYTLSISSATQFGNVPWYSEACSSTLA 

TTYSSGNQNEKQIVTTDLRQKCTESHTGTSAS 

APIJ^GIIALTLEANKNLTWRDMQHLVVQTS 

KPAHLNANDWATNGVGRKVSHSYGYGLLD 

AGAMVALAQNWTTVAPQRKCIIDILTEPKDI 

GKRLEVRKTVTACLGEPNHITRLEHAQARLT 

LSYNRRGDLAIHLVSPMGTRSTLLAARPHDY 

SADGFNDWAFMTTHSWDEDPSGEWVLEIEN 

TSEANNYGTLTKFTLVLYGTAPEGLPVPPESS 

GCKTLTSSQACWCEEGFSLHQKSCVQHCPP 

GFAPQVLDTHYSTENDVETIRASVCAPCHAS 

CATCQGPALTDCLSCPSHASLDPVEQTCSRQS 

QSSRESPPQQQPPRLPPEVEAGQRLRAGLLPS 

fiLPEVYAGLSCAFIVLVFVTVFLVLQLRSGFS 

FRG VKVYTMDRGLI SYKGLPPEAWQEECPSD 

SEEDEGRGERTAFDCDQSAL 


855 


2205 


A 


7058 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLA 

ACDVVGFDLDHTLCRYNLPESAPLIYNSFAQF 

LVKEKGYDKELLNVTPEDWDFCCKGLALDL 

EDGNFLKLANNGTVTRASHGTKMMTPEVLA 

EAYGKKEWKHFLSDTGMACRSGKYYFYDN 














YFDLPGALLCAk V V D V LTKLNNUgJU rDFW 

KDIVAAIQHNYiCMSAFKENCGIYFPEIKRDPG 

RYLHSRPESVKKWLRQLKNAGKILLLITSSHS 

DYCRLLCAWDLGNDFTDLFDIVITNALKPGFF 

SHLPSQRPFRTLENDEEQEALPSLDKPGWYSQ 

GN A VHL YELLKKMTGKPEPK W YFGD SMKS 

DIFPARHYSNWETVLILEELRGDEGTRSQRPE 

ESEPLEKKGKYEGPKAKPLNTSSKKWGSFFM 

DSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EA1AELPLDYKFTRFSSSNSKTAGYYPNPPLV 

1 SSP^ .TSK 


856 


2206 


A 


7082 


396 


1635 


SSPSVFEFEHAVQPVFrMEFLKTC VLKKNAU i 

AVCFWRSKVVQKPSVRR1STTSPRSTVMPAW 

V1DKYGKNEVLRFTQNMMMP11HYPNEVIVK 

VHAASVNPIDVNMRSGYGATALNMKRDPLH 

VKJKGEEFPLTLGRDVSGVVMECGLDVKYFK 

PGDEVWAAVPPWKQGTLSEFWVSGNEVSH 

KPKSLTHTQAASLPYVALTAWSAINKVGGLN 

DKNCTGKRVL1LGASGGVGTFAIQVMKAWD 

AHVTAVCSQDASELVRKLGADDVIDYKSGSV 

EEQLKSLKPFDFILDNVGGSTETWAPDFLKK 

WSGATYVTLVTPFLLNMDRLG1ADGMLQTG 

VTVGSKALKHFWKGVHYRWAFFMASGPCL 

DDIAELVDAGKIRPVXIEQTFPFSKVPEAFLKV 

ERGHARGKTVINVV 


857 


2207 


A 


7088 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHUK 

GHREDFRFCSQRNQTHRSSLHYKPTPDLRISIE 

NSEEALTVHAPFPAAHPASRSFPDPRGLYHFC 

LYWNRHAGRLHLLYGKRDFLLSDKASSLLCF 

QHQEESLAQGPPLLATSVTSWWSPQNISLPSA 

ASFTFSFHSPPHTGAHNASVDMCELKRDLQL 

LSOFLKHPQKASRRPSAAPASQQLQSLESKLT 

SVRFMGDMGSFEEDRINATVWKLQPTAGLQ 

DLHIHSRQEEEQSEIMEYSVLLPRTLFQRTKG 

RSGEAEKRLLLVDFSSQALFQDKNSSQVLGE 

KVLGIWQNTKVANLTEPWLTFQHQLQPKN 

VTLQCVFWVEDPTLSSPGHWSSAGCETVRRE 

TOTSCFCbmOLTYFAVLMVSSVEVDAVHKHY 

LSLLSYVGCWSALACLVTIAAYLCSRVPLPC 

RRKPRDYT1KVHMNLLLAVFLLDTSFLLSEPV 

ALTG SEAGCRAS A1FLHFSLLTCLS WMGLEG 

YNLYRLVVEVFGTYVPGYLLKLSAMGWGFPI 

FLVTLVALVDVDNYGPIILAVHRTPEGVrYPS 

MCWIRDSLVSYITNLGLFSLVFLFNMAMLAT 

MVVQ1LRLRPHTQKWSHVLTIXCLSLVLG\LP 

WALtFFSFASGTFQLVVLYLFSITTSFQGFLIFI 

WYWSMRLQARGGPSPLKSNSDSARLPISSGS 

TSSSR] 


858 


2208 


A 


7091 


185 


415 


DAGAVKSSDTNIWFRGMCDDKKGHKC^'U 

QPQHFHVAFHTEAEGAMFYFRLHVIHRVMQS 

OOOLFPSTLFSWLLE 


859 


2209 


A 


7136 


3 


302 


FFFWRQSLALLPRLECSGATGAHCNLHhFU^^ 
nrPT^A^MAGITGACYHAWLLFVFLAETGFH 
HVGQGGLELLTSSDPSGSASQSAGITGVSHCT 

WP1 


860 


2210 


A 


7156 


23 


591 


ALSTETRTPDMRRLLLVl'SLVVVLL WhAU A v 
P APK VPIKMQVKHWPSEQDPEKAWGARWE 
PPEKDDQLWLFPVQKPKLLTTEEKPRGQGR 
GP1LPGTKAWMETEDTLGRVLSPEPDHDSLY 














HPPPEEDQGEERPRLVATvIPNHQVLLGPbfcDg 
DHXYHPQ*GSRGHHCPRPVPRPRLLGLGPSLP 

CPS 


861 


2211 


A 


7161 


1220 


1003 


NYVCT1AF*EKKMGF*LSLSCLVLLFVLFLDCI 
LTTTTRIMFHCTYLFASVCLSLLNTLL SPNCL 


862 


2212 


A 


7211 


665 


847 


LKYYHlTMGlYKTGKKVlL'KSSMSNRFSVif 
YKNIOKLSFSNYVYHQNYVFS SD WSYDF 


863 


2213 


A 


7212 


924 


1273 


HGSSC ALGDLAPG*LPSGP V LSSPA VRL* RKF 
LVWDSPSCLP ATGPT* GL VLVLGGPDCT* W A 
RGQHEHKRMRAP* SCR VTVNL AKKKKKTDQ 
rnCPNYOSPPKECDYNILANSVA 


864 


2214 


A 


7214 


845 


1619 


"SDKGGKKADRKNHLRHAPPLLPHRVRbKl.H 
DPKVPVDADHVQGQDPGRAAHDIHGEDVTE 
KVSKDPLAPDEVGDTDEGHDRHGHREVGQR 
HGHDQEEVAYEERACEGGKFATVEVTDKPV 
DEALREAMPKVAKY AGGTNDKGI GMGMTV 
PISFAVFPNEDGSLQKKLKVWFR1PNQFQSDP 
P APSDKS VKIEEREGITVYSMQFGG YAKEAD 
YVAQATRLRAALEGTATYRGDIYFCTGYDPP 
MKPYGRRNEIWLLKT 


865 


2215 


A 


7246 


559 


682 


" RRLGAVAHAYTSSTLGGRGGWIT* GQELQ 1 S 
I ANMA^PR^Y 


866 


2216 


A 


7257 


641 




TCT YKYLMGWIRGRRSRHS WEMSEFHN Y NL 

DLKKSDFSTRWQKQRCPWKSKCRENASPFF 

FCCFIAVAMGIRFIIMVAIWSAVFLNSLFNQEV 

QIPLTESYCGPCPKNW1CYKNNCYQFFDESKN 

WYESQASCMSQNASLLKVYSKEDQDLLKLV 

KSYHWMGLVHIPTNGSWQWEDGSILSPNLLT 

HEMQKGDCALYASSFKGYIENCSTPNTYICM 


867 


2217 


A 


7288 


151 


396 


T SIKHEAFGSNGPDF WFFRYWSP 4 LFRQQ V V Fl 
MPFFQTLWLMNANRFCSlFTTTKVA>n<CWW 

TPYHCWLSVWCRCESHGI 


868 


2218 


A 


7298 


3 


272 


iTDTVIGGRGSGGKEFGRWVLW* VFE*RLU ir 
KGSCPAGGSRMVSESD*EGRGC*ASYPCAC* 
1 AGS* WR* GSRPAGRGTPPRSLSHARPP 


869 


2219 


A 


7332 


1223 


332 


IPRRD AEDRDESCLNP AFP1GLLHPNS V NbMAK 
FLTECTWLLLLGPGLLATVRAECSQDCATCS 
YRLVRPADINFLACVMECEGKLPSLK1WETC 
KELLQLSKPELPQDGTSTLREN SKPEESHLLA 
KRYGGFMKRYGGFMKKMDELYPNfEPEEEA 
NGSEILAKRYGGFMKKDAEEDDSLANSSDLL 
KELLETGDNRERSHHQDGSDNEEEVSKRYGG 
FMRGLKRSPQLKEKAKELQKRYGGFMRRVG 
PQKW*MTSPQNRYGGFLKRFAEALPSDEEGE 
SYSKEVPEMEKRYGGFMRF 


870 


2220 


A 


7382 


216 


1018 


T^niQRLTERTQFLDESKXNPNS*QANLLRGGO 
AGQGRGREGAESGGSRGEGPGSDGRLPATGD 
FWSPRSQRRGCCGRRAPRPEAMENGAVYSPT 
TEEDPGPARGPRSGLAAYFFMGRLPLLRRVL 
KGLQLLLSLLAFICEEWSQCTLCGGLYFFEF 
V S CS AFLLSLLILI VYCTPFYER VDTTK VKS SD 
FY1TLGTGCVF1XASIIFVSTHDRTSAEIAATVF 
GFIASFMFLLDFITMLYEKRQESQLRKPENTT 

I RAEALTEPLNA 


871 


2221 



A 


7403 


3 


393 


TSCAMCSGLL*lXLPlWLSWTLGTRUbbFKbviN 
DPGNMSFVKETVDKLLTGFRCFREREAAPRR 
ALRGAALPGESEAGDPESLRSSVNADWIQYS 














DLWEAEVSTPRCEAGFCQEChRTPGNQEKDCi 


872 


2222 




7413 


1061 


359 


F VDIVS WEFPHCPEARFPAQHGQDSl^KL 1 LC 

PGG S * PQ ATLHLDRMRVS ASF1 KJblU v kk i k 
CGLIKPCPANYFAFK1CSGAANVVGPTMCFED 

RMIMSPVKNKVGRGLNIALVNO 1 lUAVLAJV 

KAFDMYSGDVMHLVKFLKEIPGGALVLVAS 

YDDPGTKMNDESRKLFSDLGSSYAKQLGFRD 

SWVFIGAKDLRGKSPFEQFLKEQPQTQNKYE 

GWPELLEMEGCMPPKPF 


873 


2223 


A 


7429 


2242 


2394 


ILKCAGHGGSCL* SQHFGRLRWEDRLRLGVQ 
nHPnOHCETPSLLKIERKLF 


874 


2224 


A 

A 


7468 


146 


894 


TCTSCVLWATLHLFASTRKAPQAKCUM1S1 1 h 
WQKJtGVGITGFGlFFILFGTLLYFDSVLLAFGN 
LLFLTGLSLUGLRKTFWFFFQRHKLKGTSFLL 
GGVVIVLLRWPLLGMFLETYGFFSLFKGFFPV 
AFGFLGNVCNIPFLGALFRRLQGTSSMV*KTE 
MSSLNLDHWLKGAKREEWEPPPQSPALTHbr 
TYPGPPQVQKERNGAEQLTSNPQVDSRGCQE 
ARMOTPRRLGWGWYHTLTLYLWEEK 


875 


2225 


A 


7498 


91 


251


"gekpvptwlqdeagqwllgfvaqpwgw^u 

SERHEP*HGGVLFRLGPSAPPGKL 


876 


2226 


A 


7544 


403 


587 


" YSCLCFLFKMTSFKNSVHIVVLGTVVHAYNFW 
II GGOGGWIA*GQEFKTSLGNTVRPCLYK 


877 


2227 


A 


7566 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGC 

TFKDKVLVAARRNASAWLYNEERYGNITLP 

MSHAGTGNTWIMISYPKGREILELVQKGIPV 

TMTIGVGTRHVQEFISGQSWFVAIAFITMMII 

SLAWLIFYY1QRFLYTGSQIGSQSHRKETKKV1 

GQLLLHTVKHGEKGIDVDAENCAVC1ENFKV 

KDIIRILPCKHIFHRICrX)PWLLDHRTCPMCKL 

DVIKALGYWGEPGDVQEMPAPESPPGRDPAA 

NLSLALPDDDGSDESSPPSASPAESEPQCDPSF 

KfiDAGENTALLEAORSDSRHGGPIS 


878 


2228 


A 


7586 


315 


1232 
391 


" ERSLLCKVDVRWIYVSEGTKTQRRHK^UbLR 
RGRMQAACWYVLFLLQPTVYLVTCANLTNG 
GKSELLKSGSSKSTLKHIWTESSKDLSISRLLS 
OTFRGKENDTDLDLRYDTPEPYSEQDLWDW 
LRNSTDLQEPRPRAKRRPIVKTGKFKKMFGW 
GDFHSNIKTVKLNLLITGKTVDHGNGTFSVYF 
RHNSTGQGNVSVSLVPPTKIVEFDLAQQTVID 
AKDSKSFNCRIEY^KVDKATKNTLCNYDPSK 
TCYQEQTQSHVSWLCSKPFKVICrYlSFYSTD 
YKLVOKVCPDYNYHSDTPYFPSG 

' TFQWVTXWWSPTCLDOLNGSAPGNVMHO 


879
880


2229
2230


A 
A 


7605 
7612 

i 


479 
93 


659 


DAAVAMTA(^GLVANRGRKFKwAlt,Laorvj 
GGSRGRSDRGSGQGDSLYPVGYLDKQVPDTS 
VQETDRILVEKRCWD1ALGPLKQIPMNLFIMY 
MAGKTISIFPT^VCMMAWRPIQALMAISAT 
FKMLESSSQKFLQGLVYLIGNLMGLALAVYK 
CQSMGLLPTHASDWLAFEPPERMEFSGGGL 

IT- 


881 


2231 


A 


7615 




14*57 


-spqkt^^ 

n1isdqppqnfsatp>t\ttcpmdekllstvltt 
sysvifrvglvgnilalyvflgihrxrnsiqiyl 

LNVAJADLLLIFCLPFRIMYHINQNKWTLGVIL 
CKA^GTLFYMNMYISLILLGFISLDRYIKINRSI 
OORKAITTKQSIYVCCIVW>v^AI.GGFl,TMIIL 
TLKKGGHNSTMCraYRDKHNAKGE.AIFNTIL 
SEQ ID
NO: of
nucl-
eotide
seq-
uence


SEQ ID
NO: of
peptide
seq-
uence





Method



882 



883 



884 



2232 



2233 



885 



886 



887 



2234 



2235 



2236 



888 



889 



SEQ 
ID NO: 
in 

USSN 
09/496 
914 



Predicted 
beginning 
nucleotide 
location 
corresponding
to first
amino acid 
residue of 
peptide 
sequence 



7617 



7622 



7638 



67 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



379 



400 



2640 



7642 



7692 61 



201 



2237 



2238 



2239 



A 7693 



7702 



7707 



215 



2861 



455 



85 



569 



315 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

VVMFWLIFLLIILSYlKIGiCNLLRISKRXSlU , "PN 
SGKY ATT ARN SFI VLIIFTICFVT YHAFRFIYISS 
QLNVSSCYWKEIVHKTNHMLVLSSFNSCLDP 
VMYFLMSSNIRKIMCQLLFRRF QGEP SRSEST 

SRFKP GYSLHDTSVAVKIQSSSKST 

RQMALLKANKDUSAGLKEFSVLLNQQVFNU 
PLVSEEDMVTVVEDWMNFYINYYRQQVTGE 
PQERDKALQELRQELNTLANPFLAKYRDFLK 

SHE LPSHPPPSS , 

KVKTCRYNPKYSAANDTGFVDIPSREKDLAK 
AVATVGPISVAVGASHVFFQFYKKGKHLSS 



APVLlLQMVKLSIVLTPQFLSHDQGQLTKJiLQ 
QHVKSVTCPCEYLRKVSECRQMGPGALEQFP 

GLSCHTSHSG 



PSRGKMELEAMSRYTSPVNPAVFPHLTVVLX 
AIGMFFTAWFFYYEVTSTKYTRCIYKELLISL 
VASLFMGFGVLFLLLWVGIYV 



APENPFSRQHFNSETKVKLSLKTGTWLGNHA 



242 



1298 



185 



2911 



HLGEHFSTHHELGLSGKWGFLVKN1LEVIRN 
GGMETRHPGKVSSWFHRWDSRAEQHNHAE 
HHEDVPQGDEDSKVSEAQQEFPDWTCAGLP 
GLLPKALRVLLFQLKVQHRPGIHQQRPEQQD 

VS DHRYGRSVRQNRK . 

NPGCCLPYAMRTSYLLLFTLCLLLSEMASGG 
NFLTGLGHRSDHYNCVSSGGQCLYSACPIFTK 
IQGTCYRGKAKCCK 



APSHRRRYLSPSRS AGQLGNMALERLCSi VLK 
VLLITVLWEGIAVAQKTQDGQNIGIKHIPAT 
OCG1WVRTSNGGHFASPNYPDSYPPNKECIYI 
LE AAPRQRI ELTFDEH YYIEPSFECRFDHLEVR 
DGPFGFSPLIDRYCGVKSPPLIRSTGRFMWIKF 
SSDEELEGLGFRAKYSFIPDPDFTYLGGILNPIP 
DCQFELSGADGIVRSSQVEQEEKTKPGQAVD 
CIWTIKATTKAKIYLRFLDYQMEHSNECKRNF 
VAVYDGSSSIEN1.KAKFCSTVANDVMLKTGI 
GVIRMWADEGSRLNRFRMLFTSFGGASPAQA 
ALSFCHSNMCINNSLVCNGVQNCAYPWDEN 

HC 



CHYlMNPSTHHPASAGGSILGLFDFFULULUt 

MTMDALLARLKLLNPDDLREEIVKAGLKCGP 

ITSTTRFIFEKKLAQALLEQGGRLSSFYHHEA 

GVTALSQDPQRILKPAEGNPTDQAGFSEDRDF 

GYSVGLNPPEEEAVTSKTCSVPPSDTDTYRAG 

ATASKJEPPLYYGVCPVYEDVPARNERIYVYE 

NKKE AL Q A VKMIKG SRFKAF STREDAEKF AR 

G1CDYFPSPSKTSLPLSPVKTAPLFSNDRLKDG 

LCLSESETVNKERANSYKNPRTQDLTAKLRK 

AVEKGEEDTFSDLIWSNPRYLIGSGDNPTIVQ 

EGCRYNVVIHVAAKENQASICQLTLDVLENP 

DFNOU.MYPDDDEAMLQKRIRYV\T)LYLNTP 

DKMGYDTPLHFACKFGNADWNVLSSHHLI 

VKNSRNKYDKTPEDVICERSKNKSVELKERIR 

EYLKGHYYVPLLRAEETSSPVIGELWSPDQTA 

EASHVSRYGGSPRDPVLTLRAFAGPLSPAKAE 

DFRKLWKTPPREKAGFLHHVKKSDPERGFER 

VGRELAHELGYPWVEYWEFLGCFVDLSSQE 

GLQRLEEYLTQQEIGKKAQQETGEREASCRD 

KATTSGSNS1SVRAFLDEDDMSLEEIKNRQNA 

A RNN SPPTVG AFGHTRCS AFPLEQE ADLIEAA 
SEQ ID
NO: of 
nucl- 
eotide 
seq- 
uence 



SEQ ID 
NO: of 
peptide 
seq- 
uence 



Method



890 



891 



2240 



2241 



SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



7711 



7721 



892 



893 



2242 



360 



61 



7723 



894 



2243 



7729 



2244 



895 



896 



2245 



2246 



3554 



7738 



269 
1175 



1650 



2419 



670 



7753 



7754 



119 



287 



278 



372 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

EPGGPHSSRNGLCHPLNHSRTLAGKKPiCA^K 

GEEAHLPPVSDLTVEFDKLNLQN1GRSVSKTP 
DESTKTKDQ1LTSRINAVERDLLEPSPADQLG 
NGHRRTESEMSARIAKMSLSPSSPRHEDQLEV 
TREP ARRLFLFGEEPSKLDQDVLAALECADV 
DPHQFPAVHRWKSAVLCYSPSDRQSWPSPAV 
KGRFKSQLPDLSGPHSYSPGRNSVAGSNPAKP 
GLGS PGRYSPVHGSQLRRMARLAELAAL 
RHMPVIPALWEAEVGGLLEPRSSRSAWATC 



KLPWEPSFLIKMQIIRHSEQTLKTALISKNPYL 
VSQYEKLDAGEQRLMNEAFQPASDLFGPITL 
H SPSDW1TSHPE APQDFEQFF SDPYRKTPSPN 
KRSIYIQSIGSLGNTRIISEEYDCWLTGYCKAYF 
YGLRVKLLEPVPVSVTRCSFRVNENTHNLQIH 
AGDILKFLKKKKPEDAFCVVGITMIDLYPRDS 
WNFVFGQASLTDGVGIFSFARYGSDFYSMHY 
KGKVKKLKKTSSSDYS1FDNYYIPEITSVLLLR 
SCKTLTHEIGHIFGLRHCQWLACLMQGSNHL 
EEADRRPLNLCPICLHKLQCAVGFSIVERYKA 
LVRWIDDESSDTPGATPEHSHEDNGNLPKPV 

EAFKEWKEWIiKCLAVLQK 

SAPTAPARPCRAERGSGGGMLALLAASVALA 

VAAGAQDSPAPGSRFVCTALPPEAVHAGCPL 

PAMPMQGGAQSPEEELRAAVLQLRETWQQ 

KETLASARAIRELTGKLARCEGLAGGKARGA 

GATGKDTMGDLPRDPGHWEQLSRSLQTLK 

DRLESLEPLPAMPMQGGAQSPEEELRAAVLQ 

LRETWQQKETLASARAIRELTGKLARCEGL 

AGGKARGAGATGKDTMGDLPRDPGHWEQ 

LSRSLQTLKDRLESLEHQLRANVSNAGLPGD 

FREVLQQRLGELERQLLRKGAELEDEKSLLH 

NETSAHRQKTESTLNALLQRVTELERGNSAF 

KSPNAFKVSLPLRTNYLYGKJKKTLPELYAFT 

ICLWLRSSASPGMGTPFSYAVPGQANEIVLIE 

WGNNPIELLrNDKVAQLPLFVSDGKWHHICV 

TWTTRJDGMWEAFQDGKKLGTGENLAPWHPI 

KPGGVLILGQEQDTVGGRFDATQAFV GELSQ 

FNIWDRVLRAQEIVN1ANCSTNMPGNIIPVAD 

NNV PVFGGASKWPVETCEERLLDL 

LTAGTAMNYPLTLEMDLENLEDLFWELDKL 

DNYNDTSLVENHLCPATEGPLMASFKAVFVP 

VAYSL1FLLGV1GNVLVLV1LERHRQTRSSTET 

FLFHLAVADLLLVFILPFAVAEGSVGWVLGTF 

LCKTVIALHKVNFYCSSLLLACIAVDRYLAIV 

HAVHAYRHRRLLS1HITCGTIWLVGFLLALPEI 

LFAKVSQGHHNNSLPRCTFSQENQAETHAWF 

TSRFLYHVAGFLLPMLVMGWCYVG\ r VHRLR 

QAQRRPQRQKAVRVAILVTSIFFLCWSPYHIV 

IFLDTLARLKAVDNTCKLNGSLPVAITMCEFL 

GLA11CCLNPMLYTFAGVKFRSDLSRLLTKLG 

CT GPASLCQLFPSWRRSSLSESENATSLTTF 

FVTRAGRWGAGARVRGGAGGMASGAARWL 

VLAPVRSGALRSGPSLRKDGDVSAAWSGSGR 

SLVPSRSVrVTRSGAILPKPVKMSFGLLRVFSI 

VIPFLYVGTL1SKNFAALLEEHDEFVPEDDDDD 

D — 

APYAHSQVHCLDKVCGLLPFLNPEVFDQbYR 

LWLSLFL HAGKEAPHCPRTRPL 

SPAWWNSQQRWSPyLALLTLEPTFriHLLPM 
SEQ ID
NO: of
nucl-
eotide
seq-
uence


SEQ ID
NO: of
peptide
seq-
uence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














Q V STAAL A VLLCTMALCN Q VLS APL AAD 1 Y l 
ACCFSYTSRQIPQNFLADYFETSSQCSKPSVIFL 
TKRGRQVCADPSEEWVQKYVSDLELSA 


897 


2247 


A 


7761 


1725 


445 


RPRRRGTHHFSCVLGSrRVSAMFPRVSl hLPL 

RPLSRHPLSSGSPETSAAMMLLTVRHGTVRY 

RSSALLARTKNN1QRYFGTNSV1CSKKDICQSV 

RTEETSKETSESQDSEKENTKKDLLG11KGMK 

VELSTVNVRTTKPPKRRPLKSLEATLGRLRRA 

TEY APKJCRIEPLSPEL V AAAS AVADSLPFDKQ 

m crT t eoi OOHEEESRAQRDAKRPKISFSN1 

ISDMKVARSATARVRSRPELR1QFDEGYDNYP 

GOEKTDDLKiCRKNIFTGKRLNIFDMMAVTKE 

APETDTSPSLWDVEFAKQLATVNEQPLQNGF 

EELIQWTKEGKLWEFPINNEAGFDDDGSEFH 

ttutpt pkht FSFPKOGP1RHFMELVTCGLSKNP 

YLSVKQKVEH1EWFRNYFNEKKD1LKESN1QF 

KLRPWKFLFRNN 


898 


2248 


A 


7775


85 


496 


SCQTTQPP AQSCSTGTMRIMLLF1 AlLAi" iLA 
OSFGAVCKEPQEEWPGGGRSKRDPDLYQLL 
ORLFKSHSSLEGLLKALSQASTDPKESTSPEK 
Dn x rrirvppvni MGKRSVQPDSPTDVNQENVP 
SFGILKYPPRAE 


899 


2249 


A 


7785 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMLrt 1 

FSKSMNESMKNQKEFMLMNARLQLERQLIM 

QSEMRERQMAMQIAWSREFLKYFGTFFGLA 

A1SLTAGAIKKKKPAFLVPIVPLSFILTYQYDL 

GYGTLLERMKGEAEDILETEKSKLQLPRGMTT 

FEStEKARKEQSRFFIDK 


900 


2250 


A 


7789 


1465 


300 


"VWLPLKS YKJRSPSLHCQCEIFREEFLFS SLVJt 
GRDKDTFSKMAMVSEFLKQAWFIENEEQEY 
VQTVKSSKGGPGSAVSPYPTFNPSSDVAALH 
KAIMVKGVDEAT1ID1LTKRNNAQRQQIKAAY 
LQETQKPLDETLKKALTGHLEEWLALLKTP 
AOPnAHFT RAAMKGLGTDEDTLIEILASRTN 
KEIRDINRVYREELKRDLAKD1TSDTSGDFRN 
ALLSLAKGDRSEDFGVNEDLADSDARALYEA 
GERRKGTDVNVFNTELTTRSYPQLRRVFQKY 
TK Y S KHDMNKVLDLELKGDIEKCLT ATVKC A 
TSKPAFFAEKLHQAMKGVGTRHKALIRIMVS 
RSHDMNDDCAFYQKMYGISLCQAILDETKGD 

YEKILVALCGGN 


901 


2251 


A 


7796 


2 


807 


" VEFHPQRARAGARAPSMGVLLTQRTLLS1.V1. 
AT T FPSMASMAAIGSCSKEYRVLLGQLQKQT 
DLMQDTSRLLDPY1R1QGLDVPKLREHCRERP 
GAFPSEE7T.RGLGRRCFLQTLNATLGCVLHRL 
ADLEQRLPKAQDLERSGLNIEDLEKLQMARP 
N1LGLRNNIYCMAQLLDNSDTAEPTKAGRGA 
SQPPTPTPASDAFQRKLEGCRFLHGYHRFMH 
SVGRVFSKWGESPNRSRRHSPHQALRKGVRR 
TRPSRKGKRLMTRGQLPR 


902 


2252 


A 


7802 


2 


721 


" TAARRRQKGTAARRLQKGT AARRRQ^O i AA 
RRRQKGTAARRPQKGTAARRRQKGTAARRR 
QKGTA^RRRQKCTAARKrC^Nu l aakki^v^vj 
TAARRRQKGTAARRRQKGLA1ASRGCPCASR 
AGGVRGAGSRLRAMAPKVFRQYWDIPDGTD 
CHRKAYSTTSIASVAGLTAAAYRVTLNPPGTF 
LEGVAKVGQYTFTAAAVGAVFGLTTCISAHV 
REKPDDPLNYFLGGCAGGLTLGARTHNYGIG 
AAACVYFGIAASLVKMGRLEGWEVFAKPKV 






PCT/US01/03800

WO 01/57188 



SEQ ID 
NO: of 
nucl- 
eotide 
seq- 
uence 


SEQ ID
NO: of
peptide
seq-
uence


Method


SEQ 
ID NO: 
in 

USSN 
09/496 
914 


Predicted 

beginning 

nucleotide 

location 

correspondi 

ng to first 

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


903 


2253 


A 


7807 


1


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMA 

VSTVFSTSSLMLALSRHSLLSPLLSVTSFRRFY 

RGDSPTDS QKDMIEIPLPP WQERTDESIETKR 

ARLLYESRKRGMLENCILLSLFAKEHLQHMT 

EKQLNLYDRLINEPSNDWDIYYWATEAKPAP 

EIFENEVMALLRDFAKNK>(KEQRLRAPDLEY 

T.FFXPR 


904 


2254 


A 


7813 


40 


821 


"GAGRALGHLETGAGDVAAALPARKFPRbLLU 
AGARLTGWTMNVFRILGDLSHLLAMILLLGK 
1WRSKCCKGISGKSQILFALVFTTRYLDLFTNF 
1 S IYNTVMKVVFLLC A YVTVTNGYGKFRKTF 
DSENDTFRLEFLLVP VI GLSFLENY SFTLLEIL 
WTFSIYLESVAILPQLFMISKTGEAETITTHYL 
FFLGL YRAL YLAN WIRRYQTENF YDQIA WS 
GWQTIFYCDFFYLYVTKGRSWDDSNADTGL 

RSYSSI 


905 


2255 


A 


7817 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEA 

QTEMVRTLERKLEAKMDCEESDYHDLESVVQ 

Q VEQNLELMTKRA VKAENHVVKLKQEI SLL 

QAQVSNFQRENEALRCGQGASLTWKQNAD 

VALQNLRWMNSAQASIEQLVSGAETLNLVA 

EILKSIDRISEVKDEEEDS 


906 


2256 


A 


7822 


3 


1462 


DS PRNRFEILGRPTRTPTRPGPRPAMEDLD AL 

LSDLETTTSHMPRSGAPKERPAEPLTPPPSYG 

HQPQTGSGESSGASGDKDHLYSTVCKPRSPK 

PAAPAAPPFS SSSGVLGTGLCELDRLLQELNA 

TQFNITDEIMSQFPSSKVASGEQKEDQSEDKK 

RPSLPSSPSPGLPKASATSATLELDRLMASLSD 

FRVQNHLPASGPTQPP W S STNEG SPSPPEPTG 

KGSLDTMLGLLQSDLSRRGVPTQAKGLCGSC 

NKPIAGQVVTALGRAWHPEHFVCGGCSTAL 

GGSSFFEKDGAPFCPECYFERFSPRCGFCNQPI 

RHKMVTALGTHWHPEHFCCVSCGEPFGDEG 

FHEREGRPYCRRDFLQLFAPRCQGCQGPILDN 

Y1SALSALWHPDCFVCRECFAPFSGGSFFEHE 

GRPLCENHFHARRGSLCATCGLPVTGRCVSA 

LGRRFHPDHFTCTFCLRPLTKGSFQERAGKPY 

CQPCFLKLFG 


907 


2257 


A 


7828 


1792 


1671 


FIYVNQSFAPSPDQEVGTLYECFGSDGKLVLM 
YCKSQAWG 


908 


2258 


A 


7842 


110 


1172 


" KLSCPCSHGTRVTAVRGPRLKAGVQWHULU 
SLQPPPSGLKQSSHLSLSSSWDFRHAPTHPET 
YTCPKMIEMEQAEAQLAELDLLASMFPGENE 
LIVNDQLAVAELKDCIEKKTMEGRSSICVYFTI 
NMNLD V SDEKMAMF SL ACILPFK YP A VLPEI 
TVRSVLLSRSQQTQLNTDLTAFLQKHCHGDV 
CCLNATEWVREHASGYVSRDTSSSPTTGSTVQ 
SVDLIFTRLWIYSHHIYNKQCRKNILEWAKEL 
SLSGFSMPGKPGWCVEGPQSACEEFWARLR 
KLNWKRILIRHREDIPFDGTKDETERQRKFSIF 
EEKVFSVNGARGNHMDFGQLYQFLNTKGCG 
DVFQMFLWV 


909 


2259 


A 


7870 


3067 


2923 


— ur.rrwi'kl V WMYTRTCMHTYPYMYMNSV 
LISSEILLIPSKYLFESK 


910 


2260 


A 


7884 


212 


4874 


GALTWSHPLLAVCPQGVWLGSTPSGSPALLP 
PSHRVNAEPGCVVTNACASGPCPPHANCRDL 
WQTFSCTCQPGYYGPGCVDACLLNPCQNQG 
SCRHLPGAPHGYTCDCVGGYFGHHCEHRMD 
QQCPRGWWGSPTCGPCNCDVHKGFDPNCNK 
SEQ ID
NO: of 
nucl- 
eotide 
seq- 
uence 



SEQ ID 
NO: of 
peptide 
seq- 
uence 



Method



911 



912



2261 



2262 



SEQ 
ID NO: 
in 

USSN 
09/496 
914 



Predicted 

beginning 

nucleotide 

location 

corresponding

to first

amino acid 

residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino
acid residue 
of peptide 
sequence 



7890 



21 



7891 



1263 



806 



111 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

TNGQCHCKEFHYRPRGSDSCLPCDCYPVGST 

SRSCAPHSGQCPCRPGALGRQCNSCDSPFAEV 

TASGCRVLYDACPKSLRSGVWWPQTKFGVL 

ATVPCPRGALGLRGAGAAVRLCDEAQGWLE 

PDLFNCTSPAFRELSLLLDGLELNKTALDTME 

AKKLAQRLREVTGHTDHYFSQDVRVTARLL 

AHLLAFESHQQGFGLTATQDAHFNENLLWA 

G S ALL APETGDL W AALGQRAPGG SPGS AGL V 

RHLEEYAATLARNMELTYLNPMGLVTPNTML 

S1DRMEHPSSPRGARRYPRYHSNLFRGQDAW 

DPHTHVLLPSQSPRPSPSEVLPTSSSIENSTTSS 

WPPPAPPEPEPGIS1I1LLVYRTLGGLLPAQFQ 

AERRGARLPQNPVMNSPWSVAVFHGRNFLR 

GILESPISLEFRLLQTANRSKAICVQWDPPGLA 

EQHGVWTARDCELVHRNGSHARCRCSRTGT 

FGVLMDASPRERLEGDLELLAVFTHVWAVS 

VAALVLTAAILLSLRSLKSNVRGIHANVAAA 

LGVAELLFLLGIHRTHNQLVCTAVVTLLHYFF 

LSTFAWLFVQGLHLYRMQVEPRNVDRGAMR 

FYHALGWGVPAVLLGLAVGLDPEGYGNPDF 

CWISVHEPLIWSFAGPWLVrVMNGTMFLLA 

ARTSCSTGQREAKKTSALTLRSSFLLLLLVSA 

SWLFGLLAVNHSILAFHYLHAGLCGLQGLAV 

LLLFCVLNADARAAWMPACLGRKAAPEEAR 

PAPGLGPGAYNNTALFEESGLIRITLGASTVSS 

VSSARSGRTQDQDSQRGRSYLRDNVLVRHGS 

AADHTDHSLQAHAGPTDLDVAMFHRDAGA 

DSD SD SDL SLEEERSL SIPSSESEDNGRTRGRF 

QRPLCRAAQSERLLTHPKDVDGNDLLSYWPA 

LGECEAAPCALQTWGSERRLGLDTSKDAAN 

NNQPDPALTSGDETSLGRAQRQRKGILKNRL 

QYPLVPQTRGAPELSWCRAATLGHRAVPAAS 

YGRIYAGGGTGSLSQPASRYSSREQLDLLLRR 

QLSRERLEEAPAPVLRPLSRPGSQECMDAAPG 

RLEPKDRGSTLPRRQPPRDYPGAMAGRFGSR 

DALDLGAPREWLSTLPPPRRTRDLDPQPPPLP 

LSPQRQLSRDPLLPSRPLDSLSRSSNSREQLDQ 

VPSRHPSREALGPLPQLLRAREDSVSGPSHGP 

STEQLDILSSILASFNSSALSSVQSSSTPLGPHT 

TATPSATASVLGPSTPRSATSHSrSELSPDSEPR 

DTQALLSATQAMDLRRRDYHMERPLLNQEH 

LEELGRWGSAPRTHQWRTWLQCSRARAYAL 

LLQHLPVLVWLPRYPVRDWLLGDLLSGLSVA 

IMQLPQGLAYALLAGLPPVFGLYSSFYPVFIY 

FLFGTSRHISVESLCVPGPVDT 

EFGTSRSSRSMAEDLGLSFGETASVEMLPtHU 

SCRPKARSSSARWALTCCLVLLPFLAGLTTYL 

LVSQLRAQGEACVQFQALKGQEFAPSHQQV 

YAPLRADGDKPRAHLTWRQTPTQHFKNQFP 

ALHWEHELGLAFTKNRMNYTNKFLLIPESGD 

YFIYSQVTFRGMTSECSEIRQAGRPNKPDSITV 

VITKVTDSYPEPTQLLMGTKSVCEVGSNWFQ 

PIYLGAMFSLQEGDKLMVNVSDISLVDYTKE 

D KTFFGAFLL 

ACGIRHEGALPGLTATPEAMLRFLPDLAhb^L 

LILALGQAVQFQEYVFLQFLGLDKAPSPQKFQ 

PVPYELKKIFQDREAAATTGVSRDLCYVKELG 

VRGNVLRFLPDQGFFLYPKK1SQASSCLQKLL 

YFNLSAIKEREQLTLAQLGLDLGPNSYTNLGP 

ELELALFLVQEPHVWGQTTPKPGKJ^IFVLRSV 
SEQ ID
NO: of
nucl-
eotide
seq-
uence


SEQ ID
NO: of
peptide
seq-
uence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

r,^^o a i rr TT-vh T r\\7 A VOWKTYMPK K NFtTLFL 














PWPQGAVHl"NLLUVAi^wrsur^rivisj^rvji-»i ^ 
ETLVKEDRDSGVNFQPEDTCARLRCSLHASLL 
VVTLNPrX^HPSRKRRAAIPVPKLSCKNLCH 
RHOLFINFRD1J3 WHKWIIAPKGFN1ANY CHGE 
CPFSLTISLNSSNY AFMQ ALMHAVDPEIPQ AV 
ClPTKLSPISMLYODNNDNVlLRHYEDNTv^ 

VCGCG 


913 


2263 


A 


7892 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFACHJ1 1 

YLLSTRLPRGRRLGSTEEAGGRSLWFPSDLAE 

LRELSE VLREYRKEHQ AYVFLLFCG A Y L Y ka<j 

GFATPGSSFLNVLAGALFGPWLGLLLCCVLTS 

VGATCCYLLSSIFGKQLWSYFrUK.vALLV^K 

KVEENRNSLJFFFLLFIJUJ^MTPNWFLNLSAP] 

LNIP1VQFFFSVLIGLIPYNFICVQTGSILSTLTS 

LDALFSWDTVFKLLA1AMVALIPGTLIKKFSQ 

KU1 OLNETST ANHIHSRJCDT 


914 


2264 


A 


7893 


815 


959 


KSGWVWWLTPLIPALWEAQTEGSLRPEVKJS 
Rl.SNTTRPFFSKKKKILV 


915 


2265 


A 


7909 


3 


641 


HASGPGGLLRRRKU^UANMP V ARS W V CKK. 1 

YVTPRRPFEKSRLDQELKLIGEYGLRNKREV 

WRVKFTLAKIRKAARELLTLDEKDPRRLFEG 

NALLRRLVR1GVLDEGKMKLDYILGLKIEDFL 

ERRLQTQVFKLGLAKSIHHAHVLIQQCHIRVR 

EQWNTJLFFTVRLDSQKHIDFSLCFPIGVANPS 

HVKRKNASKGOGGAGARDDEEEE 


916 


2266 


A 


7914 


3 


967 


VAHTQWHTCQRLSQLTHKSILKYLL1DTHAU 

OVLILKHTHASLSLPSCQECFPSSIPSASHMVS 

HPHPPPSPRWGQTPEGLPAASPCGPGPRSCFS 

SILPTGDSWGMLACLCTVLWHLPAVPALNRT 

GDPGPGPSlQKTYDLTRYLEHQLRbLAU 1 Y lin 

YLGPPFNEPDFNPPRLGAETLPRATVDLEVW 

RSLNDKLRLTQNYEAYSHLLCYLRGLNRQAA 

TAELRRSLAHFCTSLQGLLGS1AGVMAALGY 

PLPOPLPGTEPTWTPGPAHSDFLQKMDDFWL 

LKELQTWLWRSAKDFNRLKKXMQPPAAAVT 

] ht r,AuaF 


917 


2267 


A 


7921 


2 


1166 


" RPRRGQGLVQEVQTENVTVAEGG VAtl 1 CRL 
HOYDGSIWIQNPARQTLFFNGTRALKDERFQ 
LEEFSPRRVRIRLSDARLEDEGGYFCQLYTED 
THHQIATLTVLVAPENPWEVREQAVEGGEV 
ELSCLVPRSRPAATLRWYRDRKELKGVSSSQ 
ENGKVWSVASTVRFRVDRKDDGGfflCEAQN 
OALPSGHSKQTQYVLDVQ Y £>Fi aklha^va v 
VREGDTLVLTCAVTGNPRPNQIRWNRGNESL 
PERAEAVGETLTLPGLVSADNGTYTCEASNK 
HGHARALYVLVVYGESRLRPTEGGGGAPDP 
GAVVEAQTSVPYAIVGGILALLVFUICVLVG 
MVWCSVRQKGSYLTHEASGLDEQGEAREAF 

LNGSDGHKRKEEFFI 


918 


2268 


A 


7938 


3 


2653 


" RRRLPPASPPSSSVSSSLSPSAWMACRWb 1 N 
ESPRWRSALLLLFLAGVYGNGALAEHSENVH 
ISGVSTACGETPEQrRAPSGriTSPGWPSEYPAK 
INCSWFIRANPGEimSFQDFDIQGSRRCNLD 
WLTIETYKNIESYRACGSTEPPPYISSQDHIWIR 
FHSDDNI SRKGFRL A YFSGKSEEPNCACDQFR 
CGNGKC1PEAWKCNNMDECGDRSDEEICAKE 
ANPPTAAAFQPCAYNQFQCLSRFTKVYTCLP 
ESLKCDGNIDCXDLGDEIDCDVPTCGQWLKY 
FYGTFNSFNYPDFYPPGSNCTWLIDTGDHRK 






SEQ ID
NO: of
nucl-
eotide
seq-
uence


SEQ ID
NO: of
peptide
seq-
uence


Method


SEQ 
ID NO: 
in 

USSN 
09/496 
914 


Predicted 

beginning 

nucleotide 

location 

corresponding

to first

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














VILRFTDFKLDGTGYGDYVKJYDGLEENPHK 

LLRVLTAFDSHAPLTWSSSGQIRVHFCADKV 

NAARGFNATYQVDGFCLPWEIPCGGNWGCY 

TEQQRCDGYWHCPNGRDETNCTMOQKEEFP 

CSRNGVCYPRSDRCNYQNHCPNGSDEKNCFF 

CQPGNFHCKNNRCVFESWVCDSQDDCGDGS 

DEENCPVIVPTRVITAAVIGSLICGLLLV1ALG 

CTCKLYSLRMFERRSFETQLSRVEAELLRREA 

PPSYGQLIAQGLIPPVEDFPVCSPNQASVLENL 

RLAVRSQLGFTSVRLPMAGRSSNIWNRIFNFA 

RSRHSGSLALVSADGDEWPSQSTSREPERNH 

THRSLFSVESDDTDTENERRDMAGASGGVAA 

PLPQKVPPTTAVEATVGACASSSTQSTRGGH 

ADNGRDVTSVEPPSVSPARHQLTSALSRMTQ 

GLRWVRFTLGRSSSLSQNQSPLRQLDNGVSG 

REDDDDVEMLIPISDGSSDFDVNDCSRPLLDL 

ASDQGQGLRQPYNATNPGVRPSNRDGPCERC 

GIVHTAQIPDTCLEVTLKNETSDDEALLLC 








7951 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVAb 
GG WNDVACHTTMYFMCEFDKKNM 


920 


2270 


A 


7953 


47 


572 


GGRASWPEQAKEPRREGHTDKQQTEDVLAA 

GLRCLPHLPAICARRMSPAFRAMDVEPRAKG 

VLLEPFVHQVGGHSCVLRFNETTLCKPLVPRE 

HQFYETLPAEMRKFTPQYKGKSQLLEGLPHW 

RGDVRDRGHGRPWQPSLEPSLPPTLCFPSLSS 

FSSSWPSAQHLTPSVFNPW 


921 


2271 


A 


7957 


612 


812 


RSGRTVVTGlGYSKALQSSNRNTKSLLQNEh 
MMVYSFRALSFKESTWATFQHGGEATKSRSL 

SSTQ 


922 


2272 


A 


7967 


1443 


1660 


ENITEKWKEIWMCRGNKKSCCWTFIKDRHLl 
VSCCKSKSGETLLICIFCSNLVGFFFFGIRGFSN 
WELVKPN 


923 


2273 


A 


7981 


1 


3023 


GSAPRAATAMARARPPPPPSPPPGLLPLLPPLL 

LLPLLLLPAGCRALEETLMDTKWVTSELAWT 

SHPESGWEEVSGYDEAMNPIRTYQVCNVRES 

SQNNWLRTGFIWRRDVQRVYVELKFTVRDC 

NSIPNTPGSCKETFNLFYYEADSDVASASSPFW 

MENPYVKVDTIAPDESFSRLDAGRVNTKVRS 

FGPLSKA GFYL AFQDQGACMSL I S VRAF YKK 

CASTTAGFALFPETLTGAEPTSLV1APGTCIPN 

AVEVSVPLKLYCNGDGEWMVPVGACTCATG 

HEPAAKESQCRPCPPGSYKAKQGEGPCLPCPP 

NSRTTSPAASICTCHNNFYRADSDSADSACTT 

VPSPPRGVISNVNETSLILEWSEPRDLGVRDD 

LLYNVICKKCHGAGGASACSRCDDNVEFVPR 

QLGLSEPRVHTSHLLAHTRYTFEVQAVNGVS 

GKSPLPPRYAAVNTTTNQAAPSEVPTLRLHSS 

SGSSLTLSWAPPERPNGVILDYEMXYFEKSEG 

IASTVTSQMNS VQLDGLRPD AR YWQ VRART 

VAGYGQYSRPAEFETTSERGSGAQQLQEQLP 

LIVGSATAGLVFWAVW1AIVCLRKQRHGS 

DSEYTEKLQQY1APGMKVYIDPFTYEDPNEA 

vp ffa JCFIDVSCVKIEEVIGAGEFGE VCRGRL 

KQPGRREVFVAIKTLKVGYTERQRRDFLSEA 

SIMGQFDHPN1IRLEGWTKSRPVM1LTEFME 

NCALDSFLRLNDGQFTVIQLVGMLRGIAAGM 

KYLSEMNYVHRDLAARNILVNSNLVCKVSDF 

GLSRFLEDDPSDPTYTSSLGGKIPIRWTAPEA1 

AYRKFTSASDVWSYGIVMWEVMSYGERPY 
SEQ ID
NO: of
nucl-
eotide
seq-
uence



SEQ ID 
NO: of 
peptide 
seq- 
uence 



Method



924 



925 



2274 



SEQ 
ID NO: 
in 

USSN 
09/496 
914 



926 



927 



928 



929 



930 



931 



2275 



2276 



2277 



2278 



2279 



A 7985 



Predicted 

beginning 

nucleotide 

location 

correspondi 

ng to first 

amino acid 

residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



7994 



7996 



7998 



2280 



8004 



447 
92T 



503 



130 



8007 



2281 



8008 



8009 



589 



582 



353 



588 



1016 



861 



1679 



300 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

WDMSNQDVTNAVEQDYRLPPPMDCPI ALHQ 



WUMO^V^ v ^v A ^ 1 * - — ^ 

LMLDCWVRDRNLRPKJSQIVNTLDKXIRNAA 
SLKVIASAQSGMSQPLLDRTVPDYTTF 1 1 VGD 
WLDA1KMGRYKESFVSAGFASFDLVAQMTA 
EDLLRJGVTLAGHQKKILSSIQDMRLQMNQT 

LPVQV 

FRPRTKKATAMYLEHYLDSIENLPCELQRNF 

QLMRELDQRTEDKKAEIDILAAEYISTVKTLS 

PDQRVERLQKIQNAYSKCKEYSDDKVQLAM 

QTYEMVDKHIRRLDADLARFEADLKDKMEG 

SDFESSGGRGLKKGRGQKEKRGSRGRGRRTS 

EEDTPKKKK HKGG 

LPCSFC AQCMSSFERVWLQQSHFHNPR W N SK 



SPIRC YCQHWPHCVHC 

GPCKVCCn'LAIMLQCHSF Y RKD VQVEHPKS 

LNPKYSQIENFLSADMALKRKCLLSISDLDFW 

IWDAQPVGIMQTLQNLKKIPNPGCFWSQAFQI 

RDTOP ILPLGGRYYTTIRQ 

RJQRPLNSRSPNHSLFVKAELTAKQATMKJJS v 

CLLLVTLALCCYQANAEFCPALVSELLDFFFI 

SEPLFKLSLAKFDAPPEAVAAKLGVKRCTDQ 

MSL QKRSLIAEVLVKJLKKCSV 

LAPLRCQPGTRTQPRSHPAANDPSAAMSAACi 
ARGLRATYHRLLDKVELMLPEKLRPLYNHPA 
GPRTVFFWAPIMKWGLVCAGLADMARPAEK 
LSTAQSAVLMATGFIWSRYSLVTIPKNWSLFA 
VNFFVGAAGASQLFRIWRYNQELKAKAHK 
EFARRRVFIAAREMSLLRSLRVFLVARTUS 



AGSLLRQSPQPRHTFYAGPRLSASASSKELLM 
KLRRKTGYSFVNCKKALETCGGDLKQAEIWL 
HKEAQKEGWSKAAKLQGRKTKEGLIGLLQE 
GNTTVLVEVNCETDFVSRNLKFQLLVQQVAL 
GTMMHCQTLKDQPSAYSKGFLNSSELSGLPA 
GPDREGSLKI}QLAI^GKXGENMILKJtAAWV 
KVPSGFYVGSYVHGAMQSPSLHKLVLGKYG 
ALVICETSEQKTNLEDVGRRU3QHWGMAPL 
SVGSLDDEPGGEAETKMLSQPYLLDPSITLGQ 
YVQPQGVSWDFVRFECGEGEEAAETE 



NSRVWGPWTEPSAGSLRPMARKQNRNSKEL 
GLVPLTDDTSHAGPPGPGRALLECDHLRSGV 
PGGRRRKDWSCSLLVASLAGAFGSSFLYGYN 
LSVVNAPTPYIKAFYNESWERRHGRPIDPDTL 
TLLWSVTVSIFAIGGLVGTLIVKMIGKVLGRK 
HTLL ANN GF AI S AALLMACSLQ AG AFEML IV 
G RFIMGIDGG VALS VLPMYLSEISPKEIRG SLG 
QVTAIFICIGVFTGQLLGLPELLGKESTWPYLF 
GV1WPAWQLLSLPFLPDSPRYLLLEKHNEA 
RAVKAFQTFLGKADVSQEVEEVLAESRVQRS 
IRLVSVLELLRAPYWWQVVTVIVTMACYQL 
CGLNAIWFYTNSIFGKAGIPPAK1PYVTLSTGG 
DETLAAVFSGLVIEHLGRRPLLIGGFGLMGLFF 
GTLTITLTLQDHAPWVPYLSIVGILAIIASFCSG 
PGGIPFILTGEFFQQSQRPAAFIIAGTVNWLSN 
FAVGLLFPFIQFCSLDTYCFLVFATICITGArYL , 
YFVLPETKNRTYAEISQAFSKRNKAYPPEEKI 
DSAVTDGKJNGRP 



AAGAWSAMPKAKGKTRRQKFGYSVNRKKL 
NRNARRKAAPRIECSHIRHA^T>HAKSVRQNL 
A EMGI AVDPNRAVPLRKRKVKAMEVDIEER 
SEQ ID
NO: of 
nucl- 
eotide 
seq- 
uence 


SEQ ID 
NO: of 
peptide 
seq- 
uence 


Method


SEQ 

ID NO:
in 

USSN 
09/496 
914 


Predicted 
beginning 
nucleotide 
location 
correspondi 
ng to first 
amino acid 
residue of 
peptide
sequence 


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














PKELVRJCPYVLNDLEAEASLPEKKGNTLSRD 
LIDYVRYMVENHGEDYKAMARDEKNYYQD 
TPKQIRSKJNVYKRFYPAEWQDFLDSLQKRK 

MEVE 


932 


2282 


A 


8011 


412 


I 


"SNLCLGNSWKWRWAKSRHHCIPTVTLSKRSG 

DIRGSHFSSPQRQRSQRVPGKETARVLRAGK 

QGRGQ1P1PCPWPPPPPPPPPGSPGPGCRQFHQ 

SLEAKARHPASVREMRGKVKMRRALRRAPA 

STRASSRQPNPK 


933 


2283 


A 


8012 


147 


1077 


PPVPPASRSDMAQNLKDLAGRLPAGPRGMGT 

ALKLLLGAGAVAYGVRESVFTVEGGHRAIFF 

xit?Tr;r;\/onnTiT AFflT HFR1PWFOYPI1YDIRA 

RPRKIS SPTGSKDLQMVNISLRVLSRPNAQEL 

PSMYQRLGLDYEERVLPSIVNEVLKSWAKF 

NASQLITQRAQVSLLIRRELTERAKDFSLILDD 

VAITELSFSREYTAAVEAKQVAQQEAQRAQF 

LVEKAKQEQRQKIVQAEGEAEAAKMLGEAL 

SKNPGYIKLRKIRAAQN1SKTIATSQNRIYLTA 

DNLVLNLQDESFTRGSDSLIKGKK 


934 


2284 


A


8023 


255


982 


- oncC1 qavt vn^APFfr^T AAA AFT AAOKREO 
RLRKFRELHLMRNEARKLNHQEWEEDKRL 
KLPANWEAKKARLEWELKEEEKKKECAARG 
cnvPk-viiri I FTSAFnAFRWERJGCKRKKPDLG 
FSDYAAAQLRQYHRLTKQIKPDMETYERLRE 
KHGEEFFPTSNSLLHGTHVPSTEEIDRMVTDLE 
KQIEKRDKYSRRRPYNDDADIDYINERNAKF 
NKXAERFYGKYTAEIKQNLERGTAV 


935 


2285 


A 


8027 


59 


310 


LVSSTVNLLTEKAPWNSLAWTVTSYVFLKFL 
QGGGTGSTGMRDSALTLLGIGPSHRHSLSERL 
SOHS SPAPMYSQTFHILVLG 


936 


2286 


A 


8032 


1 


639 
111 


S G RECNMAKT YD YLFKLLL1 GD SG VGKTC V L 

FRPSEDAFNSTFISTIGIDFKIRTIELDGKRIKLQ 

IWDTAGQERFRTITTAYYRGAMGIMLVYDIT 

NEKSFDNIRNWIRNffiEHASADVEKMILGNKC 

DVNDKRQVSKERGEKLALDYGIKFMETSAK 

ANINVENAFrTLARDEKAKMDKKLEGNSPQG 

SNQGVKITPDQQKRSSFFRCVLL 

F FTTH SEN S YILEK YIPI S ANLTLTI A 


937 
938 


2287 
2288 


A 
A 


8039 
8052 


393 
675 


31 1 
1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLL 

PLLEAQIPLCANLVPVPrTNATLDRITGKWFYI 

ASAFRNEEYN1C.SVQEIQATFFYFTPNKTEDTIF 

LREYQTRQDQCIYNTTYLNVQRENGTISRYV 

GGQEHFAHLL1LRDTKTYMLAFDVNDEKNW 

GLSVYADKPETTKEQLGEFYEALDCLRIPKSD 

VVYTDWKKDKCEPLEKQHEKERKQEEGES 


939 


2289 


A 


8055 


12 


1039 


'SSVAEFPERVQLSQPQNWNFSGAGGAWSLDF 

AEQLKWSAELARLGESIMDGKQGGMDGSKP 

AGPROFPGIRLLSNPLMGDAVSDWSPMHEAA 

IHGHQLSLRNLISQGWAVNIITADHVSPLHEA 

CLGGHLSCVKILLKHGAQVNGVTADWHTPL 

FNACVSGSWDCVNLLLQHGASVQPESDLASP 

IHEAARRGHVECVNSLIAYGGNIDHKISHLGT 

PLYLACENQQRACVKKLLESGADVNQGKGQ 

DSPLHAVARTASEELACLLMDFGADTQAKN 

AEGKRPVELVPPESPLAQLFLEREGPPSLMQL 

CRLRrRKCTGIQQHHKITKLVLPEDLICQFLLH 

L 


940 


2290 


A 


8058 


2 


1203 


KVLSIREPAHSTARKASEPSQPSQPSQPGGHU 
AJU.RTMDLFILFDYSEPGNFSD1SWPCNSSDC1 
SEQ ID
NO: of
nucl- 
eotide 
seq- 
uence 


SEQ ID 
NO: of 
peptide 
seq- 
uence 


Method


SEQ 
ID NO: 
in 

USSN 
09/496 
914 


Predicted 

beginning 

nucleotide

location 

corresponding

to first

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














"VVDTVMCPNMPNKSVLLYTLSFIYIFIFVICjMI 
ANSWVWVN1QAKTTGYDTHCY1LNLA1ADL 
\VV\XTCPVWVVSLVQHNQWPMGELTCKVTH 
LlFSINLFGSIFFLTCMSVDRYLSrrYFTNTPSS 
RKKMVRRVVCILVWLLAFCVSLPDTYYLKT 
VTSASNNETY CRSFYPEHSIKEWLIGMELVS V 
VLGFAVPFSIIAVFYFLLARAISASSDQEKHSS 
RKJIFSYVWFLVCWLPYHVAVLLDIFSELHYI 
PFTCRLEHALFTALHVTQCLSLVHCCVNPVL 
YSFINRNYRYELMKAFIFKYSAKTGLTKLIDA 
SRVSETEYSALEQSTK 


941 


2291 


A 


8059 


73 


432 


DMAGLMTIVTSLLFLGVCAHHI1PTGSVVLFS 
PCCMFFVSKRIPENRVVSYQLSSRSTCLKAGV 
IFTTKKGQQFCGDPKQEWVQRYMKNLDAKQ 
KKASPRARA V A VKGP VQR YPGNQTTC 


942 


2292 


A 


8067 


278 


1262 


GGIGE1KQRPSCLGRCLDPSLSVLMNISLGLGS 

VFSAVISQKPSRDICQRGTSLTIQCQVDSQVT 

MMFWYRQQPGQSLTLIATANQGSEATYESGF 

VIDKFPISRPNLTFSTLTVSNMSPEDSSIYLCSA 

GRQGTYEQYFGPGTRLTVTEDLKNVFPPEVA 

VFEPSEAEISHTQKATLVCLATGFYPDHVELS 

WWVNGKEVHSGVSTDPQPLKEQPALNDSRY 

CLSSRLRVSATFWQNPRNHFRCQVQFYGLSE 

NDEWTQDRAKPVTQIVSAEAWGRADCGFTS 

ESYQQGVLSATILYEILLGKATLYAVLVSALV 

LMAMVKRKDSRG 


943 


2293 


A 


8070 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPRKPKl 

TASERLRRRPRATARLRAHAAPPEPPLAVFAP 

PSDRKELLALPVACDPVIASVMSWVQAASLI 

QGPGDKGDVFDEEADESLLAQREWQSNMQR 

RVKJEGYRDGIDAGKAVTLQQGFNQGYKKGA 

EVILNYGRLRGTLSALLSWCHLHNNNSTLINK 

INNLLDAVGQCEEYVLICHLKSITPPSHWDLL 

D S IEDMDLCH V VP AEKKJDEAKDERLCENNA 

EFNKNCSKSHSGIDCSYVECCRTQEHAHSGK 

PKPHMDFGTDSQF 


944 


2294 


A 


8073 


1 


797 


ESARWSRQLRRTURLSFPISCGRSHAFUOUK 

MAATSGTDEPVSGELVSVAHALSLPAESYGN 

DPDIEMAWAMRAMQHAEVYYKLISSVDPQF 

LKLTK VDDQ I Y SEFRKNFETLRID VLDPEELK 

SESAKEKWRPFCLKFNGIVEDFNYGTLLRLD 

CSQGYTEENTIFAPRIQFFAIEIARNREGYNKA 

VY1SVQDKEGEKGVNNGGEKRADSGEEENT 

KNGGEKGADSGEEKJEEGINREDKTDKGGEK 

GKEADKEINKSGEKAM 


945 


2295 


A 


8074 


2 


505 


■ GAATLLRSASSAARKAAEAEQVWLHLHRYL 
SADRRVLGLREWGRPASERECSLCQRLKREL 
NMGDVEKGKKJFIMKCSQCHTVEKGGKHKT 
GPNLHGLFGRKTGQAPGYSYTAANKNKGHW 
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEER 
ADL1AYLKKATNE 


946 


2296 


A 


8081 


42 


590 


EGRRGKFGGKXCNFLFYFHSNS AESRMD VLt 

VAJFA VPLiLGQEYbUbbRLutUt, i i V v v 1 1 

YTVTPSYDDFSADFTTOYSIFESEDRLNRLDK 

DITEAffiTTlSLETARAI)HPKPVTVKPVTTEPQ 

SPRSEAMPCPVLRSPIPLPPVRVPLFRWGCISC 

KKVGRRLLMTLWMGVWQEEIGR 


947 


2297 


A 


8084 


322 


549 


" "GGGSSPRELAGAAGLTVTSQAVAARRQQP^ 
S RARAP AH SLRAALS L AS S ARS WG A V SRDRG 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method



948 



2298 



SEQ 
ID NO: 
in 

USSN 
09/496 
914 



8093 



Predicted 
beginning 
nucleotide 
location 
corresponding
to first
amino acid 
residue of 
peptide 
sequence 



3905 



949 



2299 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion



846 



8095 



2374 



PCPPAIMYQSSNKC 



MEPGE VKDRILENI SL S VKJKJLQS YF AACEDEI 
PAIRNHDKVLQRLCEHLDHAiXYGLQDLSSG 
YWVLVVHFTRREAIKQIEVLQHVATNLGRSR 
AWLYLALNENSLESYLRLFQENLGLLHKYYV 
KNALVCSHDHLTLFLTLVSGLEFIRFELDLDA 
PYLDLAPYMPDYYKPQYLLDFEDRLPSSVHG 
SDSLSLNSFNSVTSTNLEWDDSAIAPSSEDYD 
FGD VFP A VPS VPSTD WEDGDLTDTVS GPRST 
ASDLTSSKASTRSPTQRQNPFNEEP AETVSS S 
DTTPVHTTSQEKEEAQALDPPDACTELEVIRV 
TKJKJCKJGKKKKSRSDEEASPLHPACSQKKCA 
KQGDGDSRNGSPSLGRDSPDTMLASPQEEGE 
GPSSTTESSERSEPGLLIPEMKDTSMERLGQPL 
SKVIDQLNGQLDPSTWCSRAEPPDQSFRTGSP 
GDAPERPPLCDFSEGLSAPMDFYRFTVESPST 
VTSGGGHHDPAGLGQPLHVPSSPEAAGQEEE 
GGGOEGQTPRPLEDTTREAQELEAQLSLVRE 
GPVSEPEPGTQEVLCQLKRDQPSPCLSSAEDS 
G VDEGQGSP SEMVH S SEFRVDNNHLLLLMIH 
VFRENEEQLFKMIRMSTGHMEGNLQLLYVLL 
TDCYVYLLRKGATEKPYLVEEAVSYNELDY 
VSVGLDQQTVKLVCTNRRKQFLLDTADVAL 
AEFFLASLKSAM1KGCREPPYPSILTDATMEK 
LALAKFVAQESKCEASAVTVRFYGLVHWED 
PTDESLGPTPCHCSPPEGTITKEGMLHYKAGT 
SYLGKEHWKTCFWLSNGILYQYPDRTDV1P 
LLSVNMGGEQCGGCRRANTTDRPHAFQVILS 
DPPCLELSAESEAEMAEWMQHLCQAVSKGVI 
PQGVAPSPCIPCCLVLTDDRLFTCHEDCQTSF 
FRSLGTAKLGDISAVSTEPGKEYCVLEFSQDS 
QQLLPPWVIYLSCTSELDRLLSALNSGWKTIY 
QVDLPHTAIQEASNKKKFEDALSLIHSAWQR 
SDSLCRGRASRDPWC* 



ARRADTVLLESPSMLQGLXPVSLLLSVAVSAI 
KELPGVKKYEVVYPIRLHPLHKREAKEPEQQ 
EQFETELKYK^INGKUVLYLKKNKNLLAP 
GYTETYYNSTGKEITTSPQIMDDCYYQGHILN 
EKVSDASISTCRGLRGYFSQGDQRYFIEPLSPI 
HRDGQEHALFKYNPDEKNYDSTCGMDGVL 
WAHDLQQNIALPATKLVTCLKDRKVQEHEKY 
IEYYLVLDNGEFKRYNENQDEIRKRVFEMAN 
YVNMLYKKL>TIWALVGMEIWTDK^KIKIT 
PNASFTLENFSKWRGSVLSRRKRHDIAQLITA 
TELAGTTVGLAFMSTMCSPYSVGWQDHSD 
NLLRVAGTMAHEMGHNFGMFHDDYSCKCPS 
TICVMDKALSFYIPTDFSSCSRLSYDKFFEDKL 
SNCLFNAPLPTDITSTP1CGNQLVEMGEDCDC 
GTSEECTNICCDAKTCKJKATFQCALGECCEK 
CQFKKAGMVCRPAKDECDLPEMCNGKSGNC 
PDDRFQVNGFPCHHGKGHCLMGTCPTLQEQ 
CTELWGPGTEVADKSCYNRNEGGSKYGVCR 
RVDDTLIPCKANDTMCGKLFCQGGSDNLPW 
KGRIVTFLTCKTFDPEDTSQE1GMVANGTKCG 
DNKVCINAECVDIEKAYKSTNCSSKCKGHAV 
CDHELQCQCEEGWIPPDCDDSSWFHFSIWG 
VLFPMAVIFVWAMVIRHQSSREKQKKDQRP 
L STTGTRPHKQKRKPQM VKA VQPQEMSQMK 
PHVYDLPVEGNEPPASFHKDTNAJLPPTVFKD 
NPMSTPKDSNPKA 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method

SEQ 
ID NO: 
in 

USSN 
09/496 
914 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


950 


2300 


A 


8100 


1 


1251 


MGLLLMCLASAVLGSFLTLLAQFFLLYRRQPb 

PPADEAARAGEGFRYIKPVPGLLLREYLYGG 

GRDEEPSGAAPEGGATPTAAPETPAPPTRETC 

YFLNATILFLFRELRDTALTRRWVTKKIKVEF 

EELLQTKTAGRLLEGLSLRDVFLGETVPFIKTI 

RLVRPWPSATGEPDGPEGEALPAACPEELAF 

EAEVEYNGGFHLAIDVDLVFGKSAYLFVKLS 

R WGRLRL VFTRVPFTHWFF SF VEDPLIDFE V 

RSQFEGRPMPQLTSirVTvIQLKXIIKRKHTLPNY 

KIRFKPFFPYQTLQGFEEDEEHIHIQQWALTE 

GRLKVTLLECSRLLIFGSYDREANVHCTLELS 

SSVWEEKQRSSIKTGTISLTAVFMGWHRVSE 

AFPGLWYKLLVDLPFWGLEDGGPLLTVPLRQ 

CPG 


951 


2301 


A 


8108 


1612 


839 


EVALFCFEMAAGMYLEHYLDS1ENLPFELQR 

NFQLMRDLDQRTEDLKAEIDKLATEYMSSAR 

SLSSEEKLALLKQIQEAYGKCKEFGDDKVQL 

AMQTYEMVDKHniRLDTDLARFEADLKEKQI 

ESSDYDSSSSKGKKKGRTQKEKKAARARSKG 

KNSDEEAPKTAQKKLKLVRTSPEYGMPSVTF 

GSVHPSDVLDMPVDPNEPTYCLCHQVSYGE 

MIGCDNPDCSffiWFHFACVGLTTKPRGKWFC 

PRCSQERKKK 


952 


2302 


A 


8112 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRL 
LHGTTLPGGNQRELARQKNMKJCQSDSVKGK 
RRDDGLSAAARKQRDSTPRDSELMQQKQKK 

ANEKKEEPK 


953 


2303 


A 


8118


1


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPG 

LETNILKMTTPNKTPPGADPKQLERTGTVREI 

GSQAVWSLSSCKPGFGVDQLRDDNLETYWQ 

SDGSQPHLVNIQFRRKTTVKTLCIYADYKSDE 

SYTPSKISVRVGNNFHNLQEIRQLELVEPSGW 

EHWLTDNHKJCPTRTFMIQIAVLANHQNGRD 

THMRQIKrYTPVEESSIGKFPRCrriDmMYRS 

IR 


954 


2304 


A 


8133 


66 


1015 


PPLPPRSFPNLFSRPEPLPEPGRRGCNRSKIiKA 

ARAPSPPPPFEGAPGRAMVKVTFNSAJLAQKE 

AKKDEPKSGEEALIIPPDAVAVDCKDPDDW 

PVGQRRAWCWCMCFGLAFMLAGVILGGAY 

LYKYFALQPDDVYYCGIKY1KDDVILNEPSAD 

APAALYQTIEENIKIFEEEEYEFISVP\TEFADS 

DPANrVHDFNKKXTAYLDLNLDKCYVIPLNT 

SIVMPPRNLLELLINIKAGTYLPQSYLIHEHMV 

ITDRIENIDHLGFFTYRLCHDKETYXXQRRETI 

KGIOKREASNCFAIRHFENKFAVETLICS 


955 


2305 


A 


8143 


35 


1171 


"VESRSAWHEGEDQIDRLDFIRNQMNLL 1LU v 
KKKIKEVTEEVANKVSCAMTDEICRLSVLVD 
EFCSEFHPNPDVLKIYKSELNKHIEDGMGRNL 
ADRCTDEVNALVLQTQQEIIENLKPLLPAGIQ 
DKLHTLIPCKKFDLSYNLNYHKLCSDFQEDIV 
FRFSLGWSSLVHRFLGPRNAQRVLLGLSEPIF 
QLPRSLASTPTAPTTPATPDNASQEELMITLVT 
r;i AQVTSRTSMGIirVGGVIWKTlGWKLLSVS. 
LTMYGALYLYERLSWTTHAKERAFKQQFVN 
YATEKLRMIVSSTSANCSHQVKQQIATTFARL 
CQQVDITQKQLEEE1ARLPKEIDQLEK1QNNS 
KLLRNKAVQLENELENFTKQFLPSSNEES 


956 


2306 


A 


8157 


1854 


798 



ASGSPAPSSSSAMAAACGPGAAGYCLLLGLH 
LFLLTAGPALGWNDPDRMLLRDVKALTLHY 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














DR YTTSRRLDPEPQLKC VGGTAGCDS Yl i>K V I 

QCQNKGWDGYDVQWECKTDLD1AYKFGKT 

WSCEGYESSEDQYVLRGSCGLEYNLDYTEL 

GLQKLKESGKQHGFASFSDYYYKWSSADSC 

N^SGLITIVVLLGIAFVVYKIJFLSDGQYSPPP 

YSEYPPFSHRYQRFTNSAGPPPPGFKSEFTGPQ 

NTGHGATSGFGSAFTGQQGYENSGPGFWTGL 

GTGGILGYLFGSNRAATPFSDSWYYPSYPPSY 

PGTWNRAYSPLHGGSGSYSVCSNSDTKTRTA 

SGYGGTRRR 


957 


2307 


A 


8159 


1492 


528 


THVVMTGMCYAPHQVLSYINGVTTSKPGVSL 

VYSMPSRNLSLRLEGLQEKDSGPYSCSVNVQ 

DKQGKSRGHSIKTLELNVLVPPAPPSCRLQGV 

PHVGANVTLSCQSPRSKPAVQYQWDRQLPSF 

QTFFAPALDVIRGSLSLTNLSSSMAGVYVCKA 

HNEVGTAQCNVTLEVSTGPGAAWAGAVVG 

TLVGLGLLAGLVLLYHRRGKALEEPANDIKE 

D AIAPRTLP WPKS SDTISKNGTL S S VTS ARAL 

RPPHGPPRPGALTPTPSLSSQALPSPRLPTTDG 

AHPQPISPIPGGVSSSGLSRMGAVPVMVPAQS 

QAGSLV 


958 


2308 


A 


8161 


2340 


1192 


ELARRPKQQSSEKSRNMIRNWLTIFILFPLKLV 

EKCESSVSLTVPPVVKLENGSSTNVSLTLRPP 

LNATLVTTFEITFRSKNITILELPDEWVPPGVT 

NSSFQVTSQNVGQLTVYLHGNHSNQTGPRIR 

Ft VIRSSAISUNOVIGWIYFVAWSISFYPQVIM 

NWRRKSVIGLSFDFVALNLTGFVAYSVFNIGL 

LWVPYIKEQFLLKYPNGVNPVNSNDVFFSLH 

AWLTLIUVQCCLYERGGQRVSWPAIGFLVL 

AWLFAFVTMIVAAVGVITWLQFLFCFSYIKL 

AVTLVKYFPQAYMNFYYKSTEGWSIGNVLL 

DFTGGSFSLLQMFLQSYNNDQWTLIFGDPTK 

FGLGVFSIVFDWFFIQHFCLYRKRPGYDQLN 


959 


2309 


A 


8163 


521 


1345 


GERAGRRRGRLGVWAQPQPLLPRPVGSRKK 

MQPPGPPPAYAPTNGDFTFVSSADAEDLSGSI 

ASPDVKLNLGGDFIKESTATTFLRQRGYGWL 

LEVEDDDPEDNKPLLEELDIDLKDIYYKIRCV 

LMPMPSLGFNRQVVRDNPDFWGPLAVVLFFS 

MISLYGQFRWSWIITIWTFGSLTIFLLARVLG 

GEVAYGQVLGV1GYSLLPLIVIAPVLLWGSF 

EWSTLIKLFGVFWAAYSAASLLVGEEFKTK 

KPLLIYPIFLLYIYFLSLYTGV 


960 


2310 


A 


8167 


1 


2921 


MTCFKGQKGEQRSHAF'EANKDHKAKVPSPN 

LYSQLNALQFTVDERSILWLNQFLLDLKQSL 

NQFMAVYKLNDNSKSDEHVDVRVDGLMLK 

FVIPSEVKSECHQDQPRAISIQSSEMIATNTRH 

CPNCRHSDLEALFQDFKDCDFFSKTYTSFPKS 

CDNFNLLHP1FQRHAHEQDTKMHEIYKGNITP 

QLNKNTLKTSAATDVWAVYFSQFWIDYEGM 

KSGKGRPISFVDSFPLSIWICQPTRYAESQKEP 

QTCNQVSLNTSQSESSDLAGRLKRKKLLKEY 

YSTESEPLTNGGQKPSSSDTFFRFSPSSSEADI 

HLLVHVHKHVSMQINHYQYLLLLFLHESL1LL 

SENLRKDVEAVTGSPASQTSICIGLLLKbAiiLA 

LLLHPVDQANTLKSPVSESVSPWPDYLPTEN 

GDFLSSKJOCQISRDINRIRSVTVNHMSDNRSM 

SVDLSHIPLKDPLLFKSASDTNLQKGISFMDY 

LSDKHLGKJSEDESSGLVYKSGSGEIGSETSD 

KJCDSFYTDSSSVLNYREDSNILSFDSDGNQNI 

LSSTLTSKGNETOSIFKAEDLLPEAASLSENL 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














DISKEETPPVRTLKSQSSLSGKPKERCPPNLAP 

LCVS YKNMKRSSSQMSLDTI SLDSMILEEQLL 

ESDGSDSHMFLEKGNKKNSTTNYRGTAESVN 

AGANLQNYGETSPDAISTNSEGAQENHDDLM 

SVVVFKITGVNGEIDERGEDTEICLQVNQVTP 

DQLGNISLRHYLCNRPVGSDQKAVIHSKSSPE 

ISLRFESGPGAVTHSLLAEKNGFLQCHIENFST 

EFLTSSLMN1QHFLEDETVATVMPMKIQVSNT 

KINLKDDSPRS STVSLEP AP VTVHIDHL VVER 

SDDGSFHIRDSHMLNTGNDLKENVKSDSVLL 

TSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMAL 

AEAHLEKDALLHHIKKMTVE 


961 


2311 


A 


8172 


1 A A") 

1442 




TAAMSIFTPTNQIRLTNVAVVRMKRAGlCK^m 

ACYKNKWGWRSGVEKDLDEVLQTHSVFVN 

VSKGQVAKKEDLISAFGTDDQTEICKQILTKG 

EVQVSDKERHTQLEQMFRDIATIVADKCVNP 

ETKRPYTV1LERAMKDIHYSVKTNKSTKQQA 

LEVIKQLK£KMK1ERAHMRLRF1LPVNEGK1CL 

KEKLKPLIKVIESEDYGQQLEIVCUDPGCFREI 

DELEKKETKGKGSLEVLNLKDVEEGDEKFE 


962 


2312 


A 


8175 


286 


587 


NISNKAEVSSHPSV1SHSMDSFGQPRPEDNQS 

VLRRMQKKYWKTKQVFIKATGKKEDEHLVA 

SDAELDAKLEVFHSVQETCTELLKiCEKYQLR 

I NGMKS 


963 


2313 


A 


8181 


13 


2215 


' AEGCAERRGTEP WELSMS WESGAGPGLUSg 
GMDLV W S A W YGKCVKGKGSLPLS AHGIW 
AWLSRAEWDQVTVYLFCDDHKLQRYALNRJ 
TWRSRSGNELPLAVASTADLIRCKLLDVTG 
GLGTDELRLLYGMALVRFVNLISERKTKFAK 
VPLKCI^QEVNIPDWIVDLRHELTHKKMPHI 
NDCRRGCYFVLDWLQKTYWCRQLENSLRET 
WELEEFREGIEEEDQEEDKNIWDDITEQKPE 
PQDDGKSTESDVKADGDSKGSEEVDSHCKK 
ALSHKELYERARELLVSYEEEQFTVLEKFRYL 
PKAIKAWNNPSPRVECVLAELKGVTCENREA 
VLDAFLDDGFLVPTFEQLAALQIEYEENVDL 
NDVLVPKPFSQFWQPLLRGLHSQNFTQALLE 
RMLSELPALGISGIRPTYlLR^TVELrVANTKT 
GRNARRFSAGQWEARRGWRLFNCSASLDWP 
RMVESCLGSPCWASPQLLREFKAMGQGLPD 
EEQEKLLRICSIYTQSGENSLVQEGSEASPIGK 
SPYTLDSLYWSVKPASSSFGSEAKAQQQEEQ 
GSVNDVKEEEKEEKEVLPDQVEEEEENDDQE 
EEEEDEDDEDDEEEDRMEVGPFSTGQESPTA 
ENARLLAQKRGALQGSAWQVSSEDVRWDTF 
PLGRMPGQTEDPAELMLENYDTMYLLDQPV 
LEQRLEPSTC KTDTLGLSCG VG S GNC SN SSSS 
NFEGLLWSOGQLHGLKTGLQLF 


964 


2314 


A 


8184 


6 


1393 


™ EPRRNFRDD STRPRTRGRTRGRRRRACRb At 
GTGLRSLLLPPRLQLPAGPFSRCRWDPVSSPR 
PSTMPPKKGGDG1KPPPI1GRFGTSLKJGIVGLP 
NVGKSTFFNVLTNSQASAENFPFCTIDPNESR 

■i rr>\ rnnUD PTiPT PO VHK P A S K [P A FLNWDIAG 

LVKGAHNGQGLGNAFLSHISACDGIFHLTRA 

FEDDDfTHVEGSVDPERDIEIIHEELQLKDEEMI 

GPUDKLEKVAVRGGDKKLKPEYDLMCKVKS 

WVIDQKJCPVRFYHDWNDKEIEVLNKHLFLTS 

KP\1VYLVNXSEKI)YIRJCKNKWLIKIK£WVD 

KYDPGALVTPFSGALELKLQELSAEERQKYLE 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














ANMTQSALPKJIKAGFAALQLEYFFTAGPUtV 
RAWTIRKGTKAPQAAGKJHTDFEKGF1MAEV 
MKYEDFKEEGSENAVKAAGKYRQQGRNYIV 
EDGDIIFFKFNTPQQPKKK 


965 


2315 


A 


8195 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMV 

AAALGGHPLLGVSATLNSVLNSNAIKNLPPPL 

GGAAGHPGSAVSAAPGILYPGGNKYQT1DNY 

QPYPCAEDEECGTDEYCASPTRGGDAGVQIC 

LACRKRRKRCMRHAMCCPGNYCKNGICVSS 

DQNHFRGEIEETrTESFGNDHSTLDGYSRRTT 

LSSKMYHTKGQEGSVCLRSSDCASGLCCARH 

FWSKJCKP VLKEGQVCTKHRRK,G SHGLEIFQ 

RCYCGEGLSCRIQKDHHQASNSSRLHTCQRH 


966 


2316 


A 


8207 


416 


4082


KFKL IKIMLLTLnLLPVVSKFSFVSLSAPQHW 

SCPEGTLAGNGNSTCVGPAPFLIFSHGNSIFRJ 

DTEGTNYEQLWDAGVSVIMDFHYNEKRIY 

WVDLERQLLQRVFLNGSRQERVCNIEKNVSG 

MAINWINEEVIWSNQQEGIITVTDMKGNNSHI 

LLSALKYPANVAVDPVERF1FWSSEVAGSLY 

RADLDGVGVKALLETSEKITAVSLDVLDKRL 

FWIQYNREGSNSL1CSCDYDGGSVHISKHPTQ 

HNLFAMSLFGDRIFYSTWKMKTIWIANKHTG 

KDMVRINLHSSFVPLGELKVVHPLAQPKAED 

DTWEPEQKLCKLRKGNCSSTVCGQDLQSHLC 

MCAEGYALSRDRKYCEGNDWKYCEDVNEC 

AFWNHGCTLGCKNTPGSYYCTCPVGFVLLPD 

GKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPL 

SPVSWECDCFPGYDLQLDEKSCAASGPQPFL 

LFANSQDIRHMHFDGTDYGTLLSQQMGMVY 

ALDHDPVENKIYFAHTALKWIERANMDGSQ 

RERLIEEG VD VPEGL A VD W IG RRF Y WTDRGK 

SLIGRSDLNGKRSKHTIEN1SQPRGIAVHPMAK 

RLFWTDTGINPRIESSSLQGLGRLV1ASSDLIW 

PSGITIDFLTDKLYWCDAKQSVIEMANLIXjSK 

RRRLTQND VGHPF A V A VFED Y V WF SD W AMP 

SVIRVNKRTGKDRVRLQGSMLKPSSLVWHP 

LAKPGADPCLYQNGGCEHICKJCRLGTAWCS 

CREGFMKASDGKTCLALDGHQLLAGGEVDL 

KNQVTPLDILSKTRVSEDN1TESQHMLVAEIM 

VSDQDDCAPVGCSMYARQSEGEDATCQCLK 

GFAGDGKLCSDIDECEMGVTVCPPASSKCINT 

EGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPD 

STPPPHLREDDHHYSVRNSDSECPLSHDGYCL 

HDGVCMYIEALDKYACNCVVGYIGERCQYR 

DLKWWELRHAGHGQQQKVIWAVCVVVLV 

MLLLLSLWGAHYYRTQKLLSKNPKNPYEESS 

RDVRSRRPADTEDGMSSCPQPWFWIKEHQD 

LKNGGQPVAGEDGQAADGSMQPTSWRQEPQ 
LCGMGTEQGCWIPVSSDKGSCPQVMERSFH 
MPSYGTQTLEGGVEKPHSLLSANPLWQQRAL 
DPPHQMELTQ 


967 


2317 


A 


8210 


3 


601 


RLHHRFRALDRKKKGYLSRMDLQQIGALAV 

NPLGDRIIESFFPDGSQRVDFPGFVRVLAHFRP 

VEDEDTETQDPKKPEPLNSRRNKLHYAFQLY 

DLDRDGK1SRHEMLQVLRLMVGVQVTEEQL 

ENIADRTVQEADEDGDGAVSFVEFTKSLEKM 

DVEHKMSDULK 
SEQ ID
NO: of
nucleotide
sequence



968 



969 



970 



SEQ ID
NO: of
peptide
sequence



Method


2318 



2319 



2320 



SEQ
ID NO:
in 

USSN 
09/496 
914 



Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence



8211 



8215 



971 



2321 



8216 



8217 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



409 



1938 



1235 



2223 



3274 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion

ISSCPrTTAYEGSMSTLSNFTQTLEDVFRRlhir 
YMDNWRQNTTAEQEALQAKVDAENFYYVIL 
YLNIVMIGN^SFIIVAILVSTVKSKRREHSNDP 
YHQYTVEDWQEKYKSQ1LNLEESKATTHEN1G 

AA GFKMSP . 

GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWR 

VPGRLLLLLLPALCCLPGAARAAAAAAGAGN 

RAAVAVAVARADEAEAPFAGQNWLKSYGY 

LLPYDSRASALHSAKALQSAVSTMQQFYGIP 

VTGN^DQTTIEWMKKPRCGVPDHPHLSRRRR 

NKRYALTGQKWRQKHITYSIHNYTPKVGELD 

TRKAIRQAFDVWQKVTPLTFEEVPYHEIKSDR 

KEADrMlFFASGFHGDSSPFDGEGGFLAHAYF 

PGPG1GGDTHFDSDEPWTLGNANHDGNDLFL 

VAVHELGHALGLEHSSDPSAIMAPFYQYMET 

HNFKLPQDDLQGIQKIYGPPAEPLEPTRPLPTL 

PVRRIHSPSERKHERQPRPPRPPLGDRPSTPGT 

KPNICDGNFNTVALFRGEMFVFKDRWFWRL 

RMNRVQEGYPMQIEQFWKGLPARIDAAYER 

AIXjRFVFFKGDKYWVFKEVTVEPG YPHS LG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERY 

WRYSEERRATDPGYPKPITVWKGIPQAPQGA 

F1SKEGYYTYFYKGRDYWKFDNQKLSVEPGY 

PRNILRDWMGCNQKEVERRKERRLPQDDVDI 

MVTINDVPGSVNAVAWIPCILSLCILVLVYT1 

FQFTCNTKTGPQPVTYYKRPVQEWV 

SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMN 

D S LRTNVF VRFQPETI ACACrYL AARALQ IPLP 

TRPHWFLLFGTTEEE1QEICIETLRLYTRKKPN 

YELLEKEVEKRKVALQEAKLKAKGLNPDGTP 

ALSTLGGFSPASKPSSPREVKAEEKSPISrNVK 

TVKKEPEDRQQASKSPYNGVRKDSKRSRNSR 

SASRSRSRTRSRSRSHTPRRHYNNRRSRSGTY 

S SRSRSRS RSHSESPRRHHNHG SPHLKAKHTR 

DDLKSSKRHGHKRKKSRSRSQSKSRDHSDAA 

KJCHRHERGHHRDRRERSRSFERSHKSKHHGG 

SRSG HGRHRR 

DCRLQAAMPTNFTWPVEAHADGGGDE 1 At 

RTEAPGTPEGPEPERPSPGDGNPRENSPFLNN 

VEVEQESFFEGKmiALFEEEMDSNPMVSSLL 

NKLANYTNLSQGVYEHEEDEESRRREAKAPR 

MGTFIGVYLPCLQNILGVILFLRLTWIVGVAG 

VLESFLIVAMCCTCTMLTAISMSAIATNGVVP 

AGG S YYMISRSLGPEFGG A VGLCF YLGTTFA 

GAMYILGTIEIFLTYISPGAADFQAEAAGGEAA 

AMLHNMRVYGTCTLVLMALVVFVGVKYVN 

KLALVFLACWLSILAIYAGVIKSAFDPPDIPV, 

CLLGNRTLSRRSFDACVKAYGIHMNSATSAL 

WGLFCNGSQPSAACDEYFIQNNVTEIQGIPGA 

ASGVFLENLWSTYAHAGAFVEKKGVPSVPV 

AEESRASTLPYVLTDIAASrTLLVGIYFPSVTG 

IMAGSNRSGDLKDAQKSIPTGTILAI\TTSnY 

LSCrVLFGACTEGWLRDKFGEALQGNXVIGM 

LAWPSPWVIV1GSFFSTCGAGLQTLTGAPRLL 

QAIARDGIVPFLQVFGHGKANGEPTWALLLT 

VLICETGIL1ASLDSVAPILSMFFLMCYLFVNL 

ACAVQTLLRTPNWRPRFKFYHWTLSFLGMSL 

CLALMFICSWYYALSAMLIAGCIYKY1EYRG 

AEKEWGDGIRGLSLNAARYALLRVEHGPPHT 

KNWTIPQVLVMLNLDAEQAMKHPRLLSFTSQ 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














LKAGKGLTIVGSVLBU 1 YLDKHMEAgRAtE 

N1RSLMSTEKTKGFCQLWSSSLRDGMSHLIQ 

SAGLGGLKHNTVLMAWPASWKQEDNPFSW 

KNFVDTVRDTTAAHQALLVAKNVDSFPQNQ 

ERFGGGHIDVWWIVHDGGMLMLLPFT.LRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMF 

LYHLRISAEVEVVEMVENDISAFTYERTLMM 

EQRSQMLKQMQLSKNEQEREAQLIHDRNTAS 

HTAAAARTQAPPTPDKVQMTWTREKLIAEK 

YRSRDTSLSGFKDLFSNOCPDQSNVRRMHTAV 

KLNGWLNKSQDAQLVLLNMPGPPKNRQGD 

FMYMEFLEVLTEGLNRVLLVRGGGREVITIYS 


972 


2322 


A 


8224 


701


246 


TSRRVTMKFNPFVTSDRSKNRJOlHFNAFbi-tv 
RRK1MSSPLSKELRQKYNVRSMP1RKDDEVQ 
WRGHYKGQQIGKWQVYRKKYVIYIERVQ 
REKANGTTVHVGIHPSKWITRLKLDKDRKKI 
1 ,ER KAK SROV GKEKGK YKEELIEKMQE 


973 


2323 


A 


8237 


873 


4610 


"g^hagMgrvptggltggrtwspsaaprsc 

PRPGPTP APG AMDKLPPSMRKRL YSLPQQ VG 

AKAWIMDEEEDAEEEGAGGRQDPSRRSIRLR 

PLPSPSPSAAAGGTESRSSALGAADSEGPARG 

AGKSSTNGDCRRFRGSLASLGSRGGGSGGTG 

SGSSHGHLHDSAEERRL1AEGDASPGEDRTPP 

GLAAEPERPGASAQPAASPPPPQQPPQPASAS 

CEOPSVDTAIK.VEGGAAAGDQILPEAEVRLG 

OAGFMQRQFGAMLQPGVNKFSLRMFGSQKA 

VEREOERVKSAGFW1IHPYSDFRFYWDLTML 

LLMVGNLIIIPVGITFrTCDErmTWIVr^WSD 

TFFLIDLVLNFRTGIVVEDNTEIILDPQRIKMK 

YLK S WFMVDFI S SIP VD Yl FLIVETRID SE V YK 

T ARALRJ VRFTKXLSLLRLLRLSRLIRYIHQ WE 

EIFHMTYDLASAVVRIVr^IGMMLLLCHWDG 

CLOFLVP^QDFPDDCWVSINNMVNNSWGK 

OYSYALFKAMSHMLCIGYGRQAPVGMSDV 

WLTMLSMIVGATCYAMFIGHATALIQSLDSS 

RRQYQEKYKQVEQYMSFHKLPPDTRQRIHD 

YYEHRYQGKMFDEESILGELSEPLREEI1NFNC 

RKLVASMPLFANADPNFVTSMLTKLRFEVFQ 

PGDYHREGTIGKKMYFIQHGVVSVLTKGNKE 

TKLADG SY7GE1CLLTRGRRTAS VRADTYCR 

LYSLSVDNTOEVLEEYPMMRRAFETVALDRL 

DRIGKKNSILLHKVQHDLNSGVFNYQENEUQ 

OIVQHDREMAHCAHRVQAAASATPTPTPVIW 

TPLIQAPLQAAAATTSVA1ALTHHPRLPAAIFR 

PPPGSGLGNLGAGQTPRHLKRLQSLIPSALGS 

ASPASSPSQVDTPS SSSFFUQQLAGFS APAGLS 

PLLPSSSSSPPPGACGSPSAPTPSAGVAATT1A 

GFGHFHKALGGSLSSSDSPLLTPLQPGARSPQ 

AAOPSPAPPGARGGLGLPEHFLPPPPSSRSPSS 

SPGQLGQPPGELSLGL^GPLS™ 

PSLVAGASGGASPVGFTPRGGLSPPGHSPGPP 

RTFPSAPPRASGSHGSLLLPPASSPPPPQVPQR 

RGTPPLTPGRLTQDLKLISASQPALPQDGAQT 

LRRASPHSSGESMAAFPLFPRAGGGSGGSGSS 

GGLGPPGRPYGA1PGQHVTLPRKTSSGSLPPP 

LSLFGARATSSGGPPLTAGPQREPGARPEPVR 

SKLPSNL 


974 


2324 


A 


8247 


279 


468 


EYKQVv^RRFLSCQNRNDLGYGKPRKUOGLL 
LVPVKDASRICSLTYLLGSHWNNLWRSPVL 

G 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


975 


2325 


A 


8249 


62 


1571 


LV ALKN WKPKGTN1P APQSP VFGEAV bti V YM 

MTKVLGMAPVXGPRPPQEQVGPLMVKVEEK 

EEKGKYLPSLEMFRQRFRQFGYHDTPGPREA 

LSQLRVLCCEWLRPEIHTKEQILELLVLEQFLT 

ILPQELQAWVQEHCPESAEEAVTLLEDLEREL 

DEPGHQVSTPPNEQKPVWEKJSSSGTAKESPS 

SMQPQPLETSHKYESWGPLYIQESGEEQEFAQ 

DPRKVRDCRLSTQHEESADEQKGSEAEGLKG 

D1ISVI1ANKPEASLERQCVNLENEKGTKPPLX3 

EAGSKKGRESVPTKPTPGERRYICAECGKAFS 

NSSNLTKHRRTHTGEKPYVCTKCGKAFSHSS 

NLTLHYRTHLVDRPYDCKCGKAFGQSSDLLK 

HQRMHTEEAPYQCKDCGKAFSGKGSIJRHYR 

IHTGEKPYQCNECGICSFSQHAGLSSHQRLHT 

GEKPYKCKJECGKAFNHSSNFNKHHRIHTGEK 

PYWCHHCGKTFCSKSNLSKHQRVHTGEGEA 

p 


976 


2326 


A 


8257 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLE 

VAWPLFIFLILISVRLSYPPYEQHECHFPNKAM 

PSAGTLPWVQGI1CNANNPCFRYPTPGEAPGV 

VGNFNKSIVARLFSDARRLLLYSQKDTSMKD 

MRXVLRTLQQIKKSSSNLKLQDFLVDNETFS 

GFLYHNLSLPKSTVDKMLRADVILHKVFLQG 

YQLHLTSLCNGSKSEEMIQLGDQEVSELCGLP 

REKLAAAERVLRSNMDILKPILRTLNSTSPFPS 

KELAEATKTLLHSLGTLAQELFSMRSWSDMR 

QEVMFLTKVNSSSSSTQIYQAVSRIVCGHPEG 

GGLKIKSLNWYEDNNYKALFGGNGTEEDAE 

TFYDNSTTPYCNDLMXNLESSPLSRIIWKALK 

PLLVGKILYTPDTPATRQVMAEVNKTFQELA 

VFHDLEGMWEELSPKIWTFMENSQEMDLVR 

MLLDSRDNDHFWEQQLDGLDWTAQDIVAFL 

AKHPEDVQSSNGSVYTWREAFNETNQAIRTIS 

RFMECVNLNKLEPIATEVWLINKSMELLDER 

KFWAGIVFTGITPGSIELPHHVKYKIRMGIDN 

VERTNKIKDGYWDPGPRADPFEDMRYVWGG 

FAYLQDVVEQAI1RVLTGTEKKTGVYMQQMP 

YPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 

IIKGIVYEKEARLKETMRIMGLDNSILWFSWFI 

SSLIPLLVSAGLLWILKLGNLLPYSDPSWFV 

FLSVFAWTlLQCFLISTLFSRANLAAACGGn 

YFTLYLPYVLCVAWQDYVGFTLKIFASLLSP 

V AFGFGCE YFALFEEQG1GVQWDNLFESPVE 

EDGFNLTTSVSMMLFDTFLYGVMTWYIEAVF 

PGOYGIPRPWYFPCTKSYWFGEESDEKSHPGS 

NQKRISEICMEEEPTHLKLGVSIQNLVKVYRD 

GMKVAVDGLALNFYEGQITSFLGHNGAGKT 

TTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQ 

NLG VCPQHNVLFDMLTVEEHIWFYARLKGLS 

EKHVKAEMEQMALDVGLPSSKLKSKTSQLS 

GGMQRKLS V ALAFV GGSKV VILDEPTAGVDP 

YSRRGIWELLLKYRQGRTLILSTHHMDEADVL 

GDR1AIISHGKLCCVGSSLFLKNQLGTGYYLT 

LVKKDVESSLSSCRNSSSTTV r SYLKKEDSVSQS 

SSDAGLGSDHESDTLT1DVSAISNLIRKHVSEA 

RLVED1GHELTYVLPYEAAKEGAFVELFHEID 

DRLSDLGISSYGISETTLEE1FLKVAEESGVDA 

ETSDGTLPARRNRRAFGDKQSCLRPFTEDDA 

ADPNDSDEDPESRETDLLSGMDGKGSYQVKG 

WKLTQQQFVALLWKRLLLARRSRKGFFAQIV 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














LPAVFVC1ALVFSL1 VPPFGK YPSLELQFWMY 

NEQYTFVSNDAPEDTGTLELLNALTKDPGFG 

TRCMEGNPrPDTPCQAGEEEWTTAPVPQTIM 

DLFQKGKWTMQNPSPACQCSSDKIKKMLPV 

CPPGAGGLPPPQRKQNTADILQDLTGRNISDY 

LVKTYVQnAKSLKNKIWVNEFRYGGFSLGVS 

KTQALPPSQEVNDATKQMKKHLKLAKDSSA 

DRFLNSLGRPMTGLDTRNNVKVWFNNKGW 

HA1SSFLNVINNAILRANLQKGENPSHYGITAF 

NHPLNLTKQQLSEVAPMTTSVDVLVSICVIFA 

MSFVP ASF WFLIQERVSKAKHLQFI SGVKP VI 

YWLSNFVWDMCNYWPATLVniFlCFQQKSY 

VSSTr^PVLALLLLLYGWSITPLMYPASFVFK 

IPSTAYVVLTSVNLFIGINGSVATFVLELFTDN 

K^^IhTOILKSVFLIFPHFCLGRGLlDMVKNQ 

AMADALERFGENRFVSPLSWDLVGRNLFAM 

AVEGVVFFLITVL1QYRFF1RPRPVNAKLSPLN 

DEDEDVRRERQRILDGGGQNDILEIKELTKIY 

RKKRiCFAVDRICVGIPPGECFGLLGVNGAGK 

SSTFICMLTGDTTVTRGDAFLNRNSILSNIHEV 

HONMGYCPQFDA1TELLTGREHVEFFALLRG 

VPEKEVGKVGEWAIRKLGLVKYGEKYAGNY 

SGGNKRKLSTAMALIGGPPWFLDEmGMD 

PKARRFLWNCALSWKEGRSWLTSHSMEEC 

EALCTRMAIMVNGRFRCLGSVQHLKNRFGD 

GYTIVVRIAGSNPDLKPVQDFFGLAFPGSVPK 

EKHRNMLQYQLPSSLSSLARIFSILSQSKKRLH 

1EDYSVSQTTLDQVFVNFAKDQSDDDHLKDL 

SI HKNOTWDVAVLTSFLODEKVKESYV 


977 


2327 


A 


8260 


3 


1567 


IPGSTISFSLCFlFPPCVn'MVRKPVV^UbRUO 
YLQGNVNGRLPSLGNKEPPGQEKVQLKRKV 

TLLRGVSIIIGTII GAGIFISPKGVLQNTGS VGM 

SLTIWTVCGVLSLFGALSYAELGTTIKKSGGH 

YTYILE VFGPLP AF VKV WVELLIIRP AAT A VI S 

LAFGRYILEPFFIQCEIPELAIKLITAVGITWM 

VLNSMSVSWSARIQIFLTFCKLTAILIIIVPGV 

MQLIKGQTQNFKDAFSGRDSS1TRLPLAFYYG 

MYAYAGWFYLNFVTEEVENPEKTIPLAICISM 

AIVnGYVLTNVAYFTTINAEELLLSNAVAVT 

FSERLLGNFSLAVPIFVALSCFGSMNGGVFAV 

SRLFYVASREGHLPEILSMIHVRKHTPLPAVIV 

LHPLTM1MLFSGDLDSLLNFLSFARWLFIGLA 

VAGLIYLRYKCPDMHRPFKVPLFIPALFSFTC 

LFMV ALSLYSDPFSTGIGFVITLTG VPAYYLFII 

wn^KPRWFRIMSEKITRTLQIILEVVPEEDKL 


978 


2328 


A 


8261 


2 


2165 


" RGG'SLRCVLGKLLGQLLCrQSERCVRFPbOl.L 
RHRGCGLLSSRLSAGKPPLRTSFFGSWGVLPP 
LADAASMSGVRAVRISIESACEKQVHEVGLD 
GTETYLPPLSMSQNLARLAQRIDFSQGSGSEE 
EEAAGTEGDAQEWPGAGSSADQDDEEGWK 
FQPSLWPWDSVRNNLRSAJLTEMCVLYDVLS1 
VRDKKFMTLDPVSQDALPPKQNPQTLQLISK 
KKSLAGAAQILLKGAERLTKSVTENQENKLQ 
RDFNSELLRLRQHWKLRXVGDKILGDLSYR5 
AGSLFPHHGTFEV1KNTDLDLDKKIPEDYCPL 
DVQIPSDLEGSAY1KVSIQKQAPDIGDLGTVN 
LFKRPLPKSKPGSPHWQTKLEAAQNVLLCKEI 
FAOLSREAVQIKSQVPHIWKNQnSQPFPSLQ 
LSISLCHSSNDKJCSQKFATEKQCPEDHLYVLE 
HNLHLLIREFHKQTLSSIMMPHPASAPFGHKR 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














MRLSGPQAFDKNEINSLQSSEGLLEK11KQAK 

HIFLJISRAAATIDSLASRIEDPQIQAHWSNIND 

VYESSVKVLITSQGYEQICKS1QLQLNIGVEQI 

RWHRDGRVTTLSYQEQELQDFLLSQMSQHQ 

VHAVQQLAKVMGWQVLSFSNHVGLGPIES1G 

NASAITVASPSGDYAISVRNGPESGSKIMVQF 

PRNQCKDLPKSDVLQDNKWSHLRGPFKEVQ 

WNKMEGRNFVYKMELLMSALSPCLL 


979 


2329 


A 


8289 


2 


1053 


F VWNPRGGRKRRRQAA V'J QAATRASU'l t*St> 

RDGTMTQGKLSVANKAPGTEGQQQVHGEKK 

EAPAVPSAPPSYEEATSGEGMKAGAFPPAPTA 

VPLHPSWAYVDPSSSSSYDNGFPTGDHELFTT 

FSWDDQKVRRVFVRKVYTILLIQLLVTLAVV 

ALFTFCDPVXDYVQANPGWYWASYAVFFAT 

YLTLACCSGPRRHFTWNLI1XTVFTLSMAYLT 

GMLSSYYNTTSVLLCLGITALVCLSVTVFSFQ 

TKFDFTSCQGVLFVLLMTLFFSGLILAILLPFQ 

YVPWLHAVYAALGAGVFTLFLALDTQLLMG 

NRRHSLSPEEYIFGALNIYLDIIYIFTFFLQLFG 

TMRR 


980 


2330 


A 


8305 


59 


857 


ASQLPDYSISPPSLPPRlSFHPSPTLARVAMAbf 

SEATQSHSISSSSFGAEPSAPGGGGSPGACPAL 

GTKSCSSSCAVHDLIFWRDVKKTGFVFGTTLI 

MLLSLAAFSVISWSYLILALLSVTISFRIYKSV 

IQAVQKSEEGHPFKAYLDVDITLSSEAFHNY 

MNAAMVHINRALKLIIRLFLVEDLVDSLKLA 

VFMWLMTYVGAVFNGITLLILAELLIFSVPIV 

YEKYKTQIDHYVG1ARDQTKSIVEK1QAKLPG 

1AKKKAP 


981 


2331 


A 


8308 


186 


1337 


TRMSRHEGVSCDACLKGNFRGRRYKCL1CY D 

YDLCASCYESGATTTRHTTDHPMQCDLTRVD 

FDLYYGGEAFSVEQPQSFTCPYCGKMGYTET 

SLQEHVTSEHAETSTEV1CPICAALPGGDPNH 

VTDDFAAHLTLEHRAPRDLDES SG VRH VRR 

MFHPGRGLGGPRARRSNMHFTSSSTGGLSSS 

QSSYSPSNREAMDPIAELLSQLSGVRRSAGGQ 

LNSSGPSASQLQQLQMQLQLERQHAQAARQ 

Q LET ARN ATRRTNTS S VTTTITQ ST ATTN IAN 

TESSQQTLQNSQFLLTRLNDPKMSETERQSM 

ESERADRSLFVQELLLSTLVREESSSSDEDDR 

GEMADFGAMGCVDIMPLDVALENLNLKESN 

KGNEPPPPPL 


982 


2332 


A 


8315 


1 


1004 


■ GSTHASADAWAQWFCIEALVMGAPV w ylv 
AAALLVGFILFLTRSRGRAASAGQEPLHNEEL 
AGAGRVAQPGPLEPEEPRAGGRPRRRRDLGS 
RLQAQRRAQRVAWAEADENEEEAVILAQEE 
EGVEKPAETHLSGKIGAKKLRKLEEKQARKA 
QREAEEAEREERXRLESQREAEWKKEEERLR 
LEEEQKEEEERKAREEQAQREHEEYLKLKEA 
FWEEEGVGETMTEEQSQSFLTEFINYIKQSK 
V VLLEDL ASQ VGLRTQDT1NRI QDLL AEGTTT 
GVIDDRGKFIYITPEELAAVANFIRQRGRVS1A 
FT AOASNSLIAWGRESPAQAPA 


983 


2333 


A 


8320 


244 


1420 


" RRR WRARGGL VPTLA WAEATGA Y VPUKUW 
DLPTWKRNFRSALNRKEGLRLAEDRSKDPHD 
PHKIYEFVNSGVGDFSQPDTSPDTNGGGSTSD 
TQEDILDEIXGNMVLAPLPDPGPPSLAVAPEP 
CPQPLRSPSLDNPTPFPNLGPSENPLKRLLVPG 
EEWEFEVTAFYRGRQVFOOTISCPEGLRLVGS 
SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion






i 








EVGDRTLPGWPVTLPDPGMSLTDRGVMS Y V 

RHVLSCLGGGLALWRAGQWLWAQRLGHCH 

TYWAVSEELLPNSGHGPDGEVPKDKEGGVF 

DLGPFIVGSLGPPDLITFTEGSGRSPRYALWFC 

VGESWPQDQPWTKRLVMVKVVPTCLRALVE 

MARVGGASSLENTVDLHISNSHPLSLTSDQY 

KAYLQDLVEGMDFQGPGES 


984 


2334 


A 


8321 


1 


1243 


ANMAPVEHVVADAGAFLRHAALQDIGKNIY 

TTREVVTEIRDKATRRRLAVLPYELRFKEPLPE 

YVRLVTEFSKXTGDYPSLSATDIQVLALTYQL 

EAEFVGVSHLKQEPQKVKVSSSIQHPETPLHIS 

GFHLPYKPKPPQETEKGHSACEPENLEFSSFM 

FWRNPLPNIDHELQELL1DRGEDVPSEEEEEEE 

NGFFDRKDDSDDDGGGWITPSNIKQIQQELE 

QCDVPEDVRVGCLTTDFAMQNVLLQMGLHV 

LAVNGMLIREARSYILRCHGCFKTTSDMSRV 

FCSHCGNKTLKKVSVTVSDDGTLHMHFSRNP 

KVLNPRGLRYSLPTPKGGKYAINPHLTEDQRF 

PQLRLSQKARQKTNVFAPDYIAGVSPFVENDI 

SSRS ATLQ VRD STLGAGRRRLNPNASRKKFV 

KKR 


985 


2335 


A 


8322 


352 


529 


RRNNIRQFIMKVCISGQARWLTPVVPVLWET 
EAGRSLELKSLRPAWATWGNP1STKINK 


986 


2336 


A 


8325 


89 


1172 


KMNTTDIADTTLDESI YSNYYL YESrPKPC-T 1 Kfc 

GIKAFGELFLPPLYSLVFVFGLLGNSVWLVL 

FKYKRLRSMTDVYLLNLAISDLLFVFSLPFWG 

YY AADQWVFGLGLCKMIS WMYL VGFYSGIF 

FVMLMSIDRYLAIVHAVFSLRARTLTYGVITS 

L AT W S V AVF ASLPGFLF STC YTERNHTYCKT 

KYSLNSTTWKVLSSLEIN1LGLVIPLGIMLFCY 

SMTOTLQHCK^fEKKNKAVKMIFAVVVLFLG 

FWTPYNIVLFLETLVELEVLQDCTFERYLDYA 

IQATETLAFVHCCLNPnYFFLGEKFRKYILQL 

FKTCRGLFVLCQ YCGLLQIY S ADTPS SS YTQS 

TMDHDLHDAL 


987 


2337 


A 


8326 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDC 

GSVDGVIKEVNVSPCPTQPCQLSICGQSYSVN 

VTFTSNIQSKSSKAVVHGILMGVPVPFPIPEPD 

GCKSGINCPIQKDKTYSYLNKLPVKSEYPSIK 

LWEWQLQDDKNQSLFCWEIPVQIVSHL 


988 


2338 


A 


8335 


1205 


323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPA 

VAEVRLPSATLCYFCRCRLGLGAALFPRSAR 

ALAASALPAQGSRWPVLSSPGLPAAFASFPAC 

PQRSYSTEEKPQQHQKTKMIVLGFSNPINWV 

RTRHCAFLIWAYFDKEFSITEFSEGAKQAFAH 

VSKLLSQCKFDLLEELVAKEVLHALKEKVTS 

LPDNHKNALAANIDEIVFTSTGDISIYYDEKG 

RJCFVNILMCFWYLTSANIPSETLRGASVFQVK 

LGNQNVETKQLLSASYEFQREFTQGVKPDWT 

IARIEHSKLLE 


989 


2339 


A 


8349 


67 


185 


MSGFIHQLLlQ>rLFCVYHTRLKTSQGLCLLSL 
KSLHPMS 


990 


2340 


A 


8361 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQ 

ITLQG SRRRQGRT AFP A S GKXRETD Y SDGDPL 

DVHKRLPSSTGEDRAVMLGFAMMGFSVLMF 

FLLGTTILKPFMLSIQREESTCTAJHTDIMDDW 

LDCAFTCGVHCHGQGKYPCLQVFVNLSHPG 

QKALLHYNEEAVQINPKCFYTPKCHQDRNDL 

LNSALDKEFFDHKNGTPFSCFYSPASQSEDVI 






WO 01/57188 



PCT/US01/03800 



















LrKKYDQMAIFHCLFWPSLlLLGGALlVCjMV 
RLTQHLSLLCEKYSTWRDEVGGKVPYIEQH 
OFKLCIMRRSKGRAEKS 




991 


2341 


A 


8369 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGhLVL 

EGFLSAEECVAMQQRIGEIVAEMDVPLHCRT 

EFSTQEEEQLRAQGSTDYFLSSGDKIRFPFEK 

GVFDEKGNFLVPPEKSINKIGHALHAHDPVFIC 

SITHSFKVQTLARSLGLQMPVWQSMYTFKQP 

HFGGEVSPHQDASFLYTEPLGRVLGVWIAVE 

DATLENGCLWFIPGSHTSGVSRRMVRAPVGS 

APGTSFLGSEPARDNSLFVPTPVQRGALVUH 

GEVVHKSKQNLSDRSRQAYTFHLMEASGTT 

WSPENWLOPTAELPFPQLYT 


992 


2342 


A 


8370 


906 


4 


MALSGNCSRYYPREQGSAVPNSFPEWELNV 

GGOVYFTRHSTLISIPHSLLWKMFSPKRDTAN 

DLAKDSKGRFF1DRDGFLFRY1LDYLRDRQW 

LPDHFPEKGRLKREAEYFQLPDLVKLLTPDEI 

KQSPDEFCHSDFEDASQGSDTRJCPPSSLLPAD 

RKWGFITVGYRGSCTLGREGQADAKFRRVPR 

ILVCGRI SLAKE VFGETLNESRDPDRAPERYTS 

RFYLKFKHLMGAPASNFILGFWGLGQNQDK 

HPVNIYLQQRSVIRPDLTSKKAGDLKGKGDA 

OEVSRRRRWLGDPEHL 


993 


2343 


A 


8379 


1 


2794 


MRMQRHKNDTMDFGDSGKRIGGOVLCLLH^ | 

SNTSFrXLNNNGFEDlVIVIDPSVPEDEKIIEQIE 

DNmTASTYLFEATEKRFFFKNVSILIPENWK 

ENPQYKRPKHENHKHADVIVAPPTLPGRDEP 

YTKQFTECGEKGEYIHFTPDLLLGKKQNEYG 

PPGKLFVHEWAHLRWGVFDEYNEDQPFYRA 

KSKKIEATRCSAGISGRNRVYKCQGGSCLSRA 

CRIDSTTKLYGKDCQFFPDKVQTEKASIMFM 

OSIDSWEFCNEKTHNQEAPSLQNIKCNFRST 

WEVISNSEDFKNTIPMVTPPPPPVFSLLKIRQRI 

VCLVLDKSGSMGGKDRLNRMNQAAKHFLLQ 

TVENGSWVGMVHFDSTATTVNKLIQIKSSDER 

NTLMAGLPTYPLGGTSICSGIKYAFQVIGELH 

SQLDGSEVLLLTDGEDNTASSCIDEVKQSGAI 

VHF1ALGRAADEAVIEMSKITGGSHFYVSDEA 

ONNGLIDAFGALTSGNTDLSQKSLQLESKGLT 

LNSNAWMNDTV1IDSTVGKDTFFLITWNSLPP 

S1SLWDPSGTIMENFTVDATSKMAYLSIPGTA 

KVGTWAYNLQAKANPETLTITVTSRAANSSV 

PPITVNAKMNKDVNSFPSPMIVYAEILQGYVP 

VLGANVTAFIESQNGHTEVLELLDNGAGADS 

FKNDGVYSRYFTAYTENGRYSLKVRAHGGA 

NTARLKLRPPLNRAAYIPGWVVNGEtEANPP 

RPEIDEDTQTTLEDFSRTASGGAFWSQVPSL 

PLPDQYPPSQITDLDATVHEDKI1LTWTAPGD 

NFDVGKVQRYLIRISASILXILRDSFDDALQVN 

TTDLSPKEANSKESFAFKPENISEENATHIF1AI 

KSIDKSNLTSKVSNIAQVTLFIPQANPDDIDPT 

PTPTPTPTPDKSHNSGVN1STLVLSVIGSWIV 

KFfl STP 


994 
995 


2344 
2345 


A 
A 


8385 
8390 


231 
194 


644 
3421 


~ rNSSPRTGRDHQELNLHl'ERDSRSQRAVLKJP 
RQNPGIFYWIFLPSRSHSASHGSRQRQVSCQG 
TODE1LKMRNTFAELKNSLEALSSRMDQAEE 
RJGTQAGVQWRDHGSLQPQPPEFKQCFHLSL 

PSSWDYRACLS
""T^RKSSVVPPRGlRRGEKSDQDKiJUQKNKR














DFLSMKQSPALAPEERCRRAbSPkP VLRADD 

NNMGNGCSQKLATANLLRFLLLVLIPCICALV 

LLLEILLSYVGTLQKVYFKSNGSEPLVTDGEI 

OGSDVILTNTIYNQSTWSTAHPDQHVPAWT 

TDASLPGDQSHRNTSACMNITHSQCQMLPYK 

ATLTPLLSVVRNMEMEKFLKFFTYLHRLSCY 

QHIMLFGCTLAFPECIIDGDDSHGLLPCRSFCE 

AAKEG CES VLGMVNY S WPDFLRC S QFRNQT 

ESSNVSRICFSPQQENGKQLLCGRGENFLCAS 

GICIPGKLQCNGYNDCDDWSDEAHCNCSENL 

FHCHTGKCLNYSLVCDGYDDCGDLSDEQNC 

DCNPTTEHRCGDGRC1AMEWVCDGDHDCVD 

KSDEVNCSCHSQGLVECRNGQCIPSTFQCDG 

DEDCKDGSDEENCSVIQTSCQEGDQRCLYNP 

CLDSCGGSSLCDPNNSLNNCSQCEPITLELCM 

NLPYNSTSYPNYFGHRTQKEASISWESSLFPA 

LVOTNCYKYLMFFSCTILVPKCDVNTGEHIPP 

CRALCEHSKERCESVLGIVGLQWPEDTDCSQ^ 

FPEENSDNQTCLMPDEYVEECSPSHFKCRSGQ 

CVLASRRCDGQADCDDDSDEENCGCKERDL 

WECPSNKQCLKHTVICDGFPDCPDYMDEKN 

CSFCQDDELECANHACVSRDLWCDGEADCS 

DSSDEWDCVTLSINVNSSSFLMVHRAATEHH 

VCADGWQEILSQLACKQMGLGEPSVTKLIQE 

OEKEPRWLTLHSNWESLNGTTLHELLVNGQS 

CESRSKISLLCTKQDCGRRPAARMNKRILGGR 

TSRPGRWPWQCSLQSEPSGHICGCVL1AKKW 

VLTVAHCFEGRENAAVWKWLGINNLDHPS 

VFMQTRFVKTIILHPRYSRAVVDYDISIVELSE 

DISETGYVRPVCLPNPEQWLEPDTYCYITGW 

GHMGNKMPF1CLQEGEVRI1SLEHCQSYFDMK 

TITTRMICAGYESGTVDSCMGDSGGPLVCEK 

PGGRWTLFGLTSWGSVCFSKVLGPGVYSNVS 

YFVEWIKROIYIQTFLLN 


996 


2346 



A 


8392 


"^99 


3085 



■ KVILSSEMSKTNKSKSGSRSSRSRSASK^KbK^ 
FSKSRSRSRSLSRSRKRRLSSRSRSRSYSPAHN 
RERNHPRVYQNRDFRGHNRGYRRPYYFRGR 
NRGFYPWGQYNRGGYGNYRSNWQNYRQAY 
SPRRGRSRSRSPKRRSPSPRSRSHSRNSDKSSS 
DRSRRSSSSRSSSNHSRVESSKRKSAKEKKSSS 
KDSRPSQAAGDNQGDEVKEQTFSGGTSQDTK 
ASESSKPWPDATYGTGSASRASAVSELSPRER 
SPALKSPLQSVVVRRRSPRPSPVPKPSPPLSST 
SQMGSTLPSGAGYQSGTHQGQFDHGSGSLSP 
SKKSPVGKSPPSTGSTYGSSQKEESAASGGAA 
YTKRYLEEQKTENGKDKEQKQTNTDKEKIKE 
KGSFSDTGLGDGKMKSDSFAPKTDSEKPFRG 
SOSPKRYKLRDDFEKKMADFHKEEMDDQDK 
DKAKGRKESEFDDEPKFMSKVIGANKNQEEE 
KSGKWEGLVYAPPGKEKQRKTEELEEESFPE 
RSKKEDRGKRSEGGHRGFVPEKNFRVTAYK 
AVQEKSSSPPPRKTSESRDKLGAKGDFPTGKS 
SFSITREAQVNVRMDSFDEDLARPSGLLAQER 
v-t pRDLVHSNKKEOEFRSIFQHIQSAQSQRSP 
SELFAQraVTl VHHVKEHHFGS SGMTLHERFT 
KYLKROTEQEAAKNKKSPEIHRRIDISPSTFRK 
HGLAHDEMKSPREPGYKAEGKYKDDPVDLR 
LDIERRKKHKERDLKRGKSRESVDSRDSSHSR 
ERSAEK.TEKTHKGSKKQKKHRRARDRSRSSS 
SSSQSSHSYKAEEYTEETEEREESTTGFDKSRL














GTKDFVGPSERGGGRARUTFQFRARGRCi WU 
RGNYSG^^NNNNSNNDFQKiWREEEWDPEYT 
PKSKJCYYLHDDREGEGSUkW V bKUKVjruj/vr 
PRGRGRFMFRKSSTSPKWAHDKFSGEEGEIE 
DDKSGTENREEKDNIQPTTE 


997 


2347 


A 


8398 


202 


552 


"CPALGGRQDLQGTRLLWAHDSGVGGQKARJs 
KQENLESLEATGREEEGGQGPPVTTKGVLLA 
LLMAGLALQPGTALLCVSCKAQVSNEDCLQ 
VFNCTOLGEOCV/TARIREWGDDSRQA 


998 


2348 


A 


8400


697


301 


NPPSACTPGSCDSCSGRGRJJLAFDSYWSTNN 
MSDPRRPNKVLRYKPPPSECNPALDDPTPDY 
MNLLGMIFSMCGLMLKLKWCAWVAVYCSF1 
SFANSRSSEDTKQMMSSFMLSISAWMSYLQ 

fJPQPMTPPW 


999 


2349 


A 


8401 


93 


1126


ASASHITSGHLRCFPGSEGVGTMARCFSLVLL 

LTSIWTTRLLVQGSLRAEELSIQVSCRIMG1TL 

VSKKANQQLNFTEAKEACRLLGLSLAGKDQ 

VETALKASFETCSYGWVGDGFVVISRISPNPK 

CGKNGVGVLIWKVPVSRQFAAYCYNSSDTW 

TNSCIPEnTTKDPIFNTQTATQTTEFIVSDSTYS 

VASPYSTIPAPTTTPPAPASTSIPRRKKLICVTE 

WMETSTMSTETEPFVENKAAFKNEAAGFGG 

VPTALLVLALLFFGAAAGLGFCYVKRYVKAF 

PPTNKNQQKEM1ETKVVKEEKANDSNPNEES 

^KrmKNPEESKSPSKTTMRCLEAEV 


1000 


2350 


A 


8406 


2 


777 


KERCQFWKPMLSTVGSFLQDLQNEDKG1K 1 
AAII^AJDGNMISASTLMDILLMNDFKJLVINKI 
AYDVQCPKREKPSNEHTAEMEHMKSLVHRL 
FTILHLEESQKKREHHLLEK1DHLKEQLQPLE 
OVKAGIEAHSEAKTSGLLWAGLALLSIQGGA 
L A WLTWWVY S WD IMEPVTYF ITF AN SMVFF 
AYFIVTRQDYTYSAVKSRQFLQFFHKKSKQQ 
HFDVQQYNKLKEDLAKAKESLKQARHSLCL 

OMOVEELNEKN 


1001 


2351 


A 


8410 


1400 


264 


' VGF WERPLRSSRWFRRSLRRWEMLAKAAKU 
TG ALLLRGSLLASGRAPRRAS SGLPRNTVVLF 
VPQQEAWWERMGRFHRILEPGLNILIPVLDR 
IRYVQSLKEIVINVPEQSAVTLDNVTLQIDGV 
LYLR1MDPYKASYGVEDPEYAVTQLAQTTM 
RSELGKLSLDKVFRERESLNASIVDAINQAAD 
C WGIRCLRYEIKDIHVPPRVKESMQMQ V b At 
RRKRATVLESEGTRES AINV AEGKKQAQILAS 
EAEKAEQINQAAGEASAVLAKAKAKAEAIRI 
L AAALTQHNGD AAA SLTV AEQ YV S AFSKLA 
KDSNTILLPSNPGDVTS\fVAQAMGVYGALT 
KAPVPGTPDSLSSGSSRDVQGTDASLDEELDR 


1002 


2352 


A 


8421 


134 


941 


" NRENLLESRMMDPCSVGVQLRTTNECHK1Y 
YTRHTGFKTLQELSSNDMLLLQLRTGMTLSG 
NKTICFHHVKIYlDRFEDLQKSCCDPf 
AKKNLHVIDLDDATFLSAKFGRQLVPGWKLC 
PKCTQIINGSVDVDTEDRQKRKPESDGRTAK 
ALRSLQFTNPGRQTEFAPETGKREKRRLTKN 
ATAGSDRQVIPAKSKVYDSQGLLIFSGMDLC 
DCLDEDCLGCFYACPACGSTKCGAECRCDRK 

WLYEOIEIEGGEIIHNKHAG 


1003 


2353 


A 


8427 


3 


1416 


TEWGLSGSCPGCSPLEPGSRGRGAAAWKLLK 
CRRLPEPSPFLTQPNLAQSQPPAPVPVTDPSVT 

MHPAVFLSLPDLRCSLLLLVTWVFTPVTTEIT

















SLDTEN1DE1LNN AD VALVN F V AD W CRF SQM 

LHPIFEEASDVIKEEFPNENQVVFARVDCDQH 

SD1AQRYRISKYPTLKLFRNQMMMKREYRGQ 

RSVKALADYIRQQKSDP1QEIRDLAEITTLDRS 

KRNIIGYFEQKDSDNYRVFERVANILHDDCAF 

LSAFGDVSKPERYSGDN1IYKPPGHSAPDMVY 

LGAMTNFDVTYNW1QDKCVPLVREITFENGE 

ELTEEGLPFLILFHMKEDTESLEIFQNEVARQL 

ISEKGTTNFLHADCDKFRHPLLHIQKTPADCP 

VIAIDSFRHMYVFGDFKDVLIPGKLKQFVFDL 

HSGKLHREFHHGPDPTDTAPGEQAQDVASSP 

PES SFOKLAPSE YRYTLLRDRDEL 


1004 


2354


A 


8432 


910 


387 


GLSRKLRAGFLPGFCRVSPCGSWVVETLVKJ4 

ACAAARSPADQDRFICIYPAYLNNKKTIAEGR 

RI PISKAVENPTATEIQD VCS AVGLNVFLEKN 

KMYSREWNRDVQYRGRVRVQLKQEDGSLC 

LVQFPSRKSVMLYAAEMrPKLKTRTQKTGGA 

DOSLOOGEGSKKGKGKKKK 


1005 


2355 


A 


8453 


90 


530 


OSHETKMQSGTHWRVLGLCLLSVGVWUQU 

GNEEMGGITQTPYKVSISGTTVILTCPQYPGSE 

ILWQHNDKNIGGDEDDKNIGSDEDHLSLKEF 

SELEQSGYYVCYPRGSKPEDANFYLYLRARG 

NPGLONRYHRLFREDHSKGHSQ 


1006 


2356 


A 


8458 


3 


307 


A VQRIRHEMNIFRLTGDLSHLAAI VILLLRI w 
KTRSCAGISGKSQLLFALVFTTRYLDLFTSFIS 
LYNTSMKVWYAIHRNVFHLQCTGLWTLNLC 


1007 


2357 


A 


8459 


43 


553 


GAGAGGDWAAMDKXKKVLSGQDIEDKbUi. . 

SEVVEASSLSWSTRIKGFIACFAIGILCSLLGT 

VLLWVPRKGLHLFAVFYTFGN1ASIGSTIFLM 

GPVKQLKRMFEPTRLIATIMVLLCFALTLCSA 

FWWHNKGLALIFCILQSLALTWYSLSFIPFAR 

DAVKKCFAVCLA 


1008 


2358 


A 


8462 


487 


150 


AQDlRSVHSLGQKSrFVKHFRTLSHLHUivFUt J 
PPHWPPQERSPPSHPCMPSHRPQIPQLSNSGPS 
DPRWGCVGPSMPTSTCLPGAVEASTTKASLP 
KCPVDSSLPTPEACFL 


1009 


2359 


A 


8465 


134 


954 


' ETRVKTSLELLRTQLEFIGTVGNTIM'l Sgp V r 
NETIIVLPSNVINFSQAEKPEPTNQGQDSLKKH 
LHAEIKVIGTIQILCGMMVLSLGIILASASFSPN 
FTQVTSTLLNSAYPFIGPFFFI1SGSLSLATEKRL 
TKLLVHSSLVGSILSALSALVGFIILSVKQATL 
NPASLQCELDKNNIPTRSYVSYFYHDSLYTTD 
CYTAKASLAGTLSLML1CTLLEFCLAVLTAVL 
RWKQAYSDFPGSVLFLPHSYIGNSGMSSKMT 
HDCGYEELLTS 


1010 


2360 


A 


8468 


2 



473 


" K YRYRRP YP VMRKJCQ VGPAGLAF1LN lsr v A 
HRVALCHLAGCQEQAAWYHTLQILFFLVSAY 
FFSCPVPEKYFPGSCDIVGHGHQIFHAFLSICT 
LSQLEAJLLDYQGRQEIFLQRHGPLSVHMACL 
SFFFLAACSAATAALLRHKVKARLTKKDS 


1011 


2361 


A 


8478 


: 5 



409 


" TEL^QLEKAHPPADMGRRKSKRKPPPKJ^^ l 
rTT FTnTrrrPFnsTHFKSrDVKMDRARNTGVI 
SCTVCLEEFQTPITCILGNLGFFQRVGRGEESG 
PCSSGPLCALVQGQSRPEEQVPPSDFCGVRRC 

RAGFQCQ 


1012 


2362 


A 


8481 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPbNUKM 
RMKVGOQEFWADLNAMNVYETTEFDQLRR 
LSTPPSSNVNSniiTVWKFFCRDHFGWREYPE














S V1KJLIEEAN SRGLKE VRFMM WNNHY 1LHN 5 

FFRREDCRRPLFRSCFILLPYLQTLGGVPTQAP 

PPLEATSSSQUCPDGVTSANFYPETWVYMHP 

SQDF1QVPVSAEDKSYRIIYNLFHKTVPEFKYR 

ILQn^RVQNQFLWEKYKRKK£YMNRKMFGR 

DRIIr^RHLFHGTSQDVVTCICKHNFDPRVCG 

KHATMFGQGSYFAKKASYSHNFSKKSSKGV 

HFMFL AKVLTGRYTMG SHGMRRPPPVNPGS 

VTSDLYDSCVDNFFEPQIFVIFNDDQSYPYFV1 

OYEEVSNTVS1 


1013 


2363 


A 


8488 


2 


517 


IENCRTRLRQAWHEVCGNKMAAPIPQGFSLL. 
SRFLGWWFRQPVLVTQSAATVPVRTKKRFTP 
PIYQPKFKTEKJEFMQHARKAGLVIPPEKSDRS 
IHLACTAG1FDAYVPPEGDARISSLSKEGUER 
TERMKKTMASQ VS IRRIKD YDANFKIKDFPE 
KAKD1FIEGSPLY 


1014 


2364 


A 


8501 


363 


17 


Y1RTG Y V YICE Y AQLMYTYYIRT AYV Y ICILY 
AQLMYTYVLYTHSLCIHMYSIRTAYVY1CIIY 
AQI^YVFYTHRLCIHMYSIRTDYVYICILY 
AOLMYTYVFYTHSYMSDE 


1015 


2365 


A 


8504 


3 


2190 


* NSSEHFSQAPQRLSFYSWYGSARLFRFKVF™ 
AVLLRWLLQVSRESGAACTDAEITVHFRSGA 
PPVTNPLGTSFPDDTAVQPSFQVGVPLSTTPRS 
NASVNVSHPAPGDWFVAAHLPPSSQK1ELKG 
LAPTCAYVFQPELLVTRWEISIMEPDVPLPQ 
TLLSHPSYLKVFVPDYTRELLLELRDCVSNGS 
LGCPVRLTVGPVTLPSNFQKVLTCTGAPWPC 
RLLLPSPPWDRWLQVTAESLVGPLGTVAFSA 
VAALTACRPRSVTIQPLLQSSQNQSFNASSGL 
LSPSPDHQDLGRSGRVDRSPFCLTNYPVTRED 
MD W S VHFQPLDR VS VR VCSDTP S VMRLRL 
NTGMDSGGSLTISLRANKTEMRNETVVVACV 
NAASPFLGFNTSLNCTTAFFQGYPLSLSAWSR 
RANLIIPYPETDNWYLSLQLMCPENAEDCEQ 
AWHVETTLYLVPCLNDCGPYGQCLLLRRHS 
YLYASCSCKAGWRGWSCTDNSTAQTVAQQR 
AATLLLTLSNLMFLAPIAVSVRRFFLVEASVY 
AYTMFFSTFYHACDQPGEAVLCILSYDTLQY 
CDFLGSGAAIWVTILCMARLKTVLKYVLFLL 
GTLVIAMSLQLDRRGNfWNMLGPCLFAFVTM 
ASMWAYRCGHRRQCYPTSWQRWAFYLLPG 
VSN1ASVGLAJYTSMMTSDNYYYTHSIWHILL | 
AGSAALLLPPPE>QPAEPWACSQKFPCHYQ1C 
KNDREELYAVT 


1016 


2366 


A 


8511 


1 


453 


" ' KW YPSGPVR1PGRF YYKLPAGHRRCKMAFAK 
KGGEKKKGRSA1NEVVTREYTINIHKRIHGVG 
FKKRAPRALKEIRKFAMKEMGTPDVRIDTRL 
NKAVWAKGIRN\TYRIRVRLSRKRNEDEDSP 
NKLYTLVTYA^VTTFKNLQTVNVDEN 


1017 


2367 


A 


8513 


54 


1196 


~ LERTPASADMAWTKYQLFLAGLMLVIUSlWi 
LSAKWADNFMAEGCGGSKEHSFQHPFLQAV 
GMFLGEFSCLAAFYLLRCRAAGQSDSSVDPQ 
QPFNPLLFLPPALCDMTGTSLMYVALNMTSA 
SSFQMLRGAVIIFTGLFSVAFLGRRLVLSQWL 
GILATIAGLVWGLADLLSKHDSQHKLSEVIT 
GDLLDMAQIIVAIQMVLEEKJVYKHNVHPLR 
AVGTEGLFGFVILSLLLVPMYYIPAGSFSGNP 
RGTLEDALDAFCQVGQQPLIAVALLGNISS1A 
FFOTAGISVTKELSATTRMVLDSLRTVVIWAL














SLALGWEAFHALQILGFLELLIGTALYNGLHR 
PLLGRLSRGRPLAEESEQERLLGGTRTPINDA 

S 


1018 


2368 


A 


8518 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALV 
VSGGIVGYVKTGSVPSLAAGLLFGSLAGLGA 
YQLYQDPRNVWGFLAAT^VTFVGVMGMRS 
YYYGKFMPVGLIAGASLLMAAKVGVRMLM 

TSD 


1019


2369 


A 


8526 


2 


1787 


VSAAAVNMEPPDAPAQARGAPRLLLLAVLL 

AAHPDAQAEVRLSVPPLVEVMRGKSVILDCT 

PTGTHDHYMLEWFLTDRSGARPRLASAEMQ 

GSELQVTMHDTRGRSPPYQLDSQGRLVLAEA 

QVGDERDYVCVVRAGAAGTAEAAARLNVF 

AKPEATEVSPNKGTLSVMEDSAQE1ATSNSRN 

GNPAPK1TWYRNGQRLEVPVEMNPEGYMTS 

RTVREASGLLSLTSTLYLRLRKDDRDASFHC 

AAHYSLPEGRHGRLDSPTFHLTLHYPTEHVQ 

FWVGSPSTPAGWVREGDTVQLLCRGDGSPSP 

EYTLFRLQDEQEEVLNVNLEGNLTLEGVTRG 

QSGTYGCRVEDYDAADDVQLSKTLELRVAY 

LDPLELSEGKVLSLPLNSRAWNCSVHGLPTP 

AJLRWTKDSTPLGDGPMLSLSSITFDSNGTYYC 

EASLPTVPVLSRTQNFTLLVQGSPELKTAEIEP 

KADGSWREGDEVTLICSARGHPDPKLSWSQL 

GGSPAEPIPGRQGWVSSSLTLKVTSALSRDGI 

SCEASNPHGNKRHVFHFGTVSPQTSQAGVAV 

MAVAVSVGLLLLWAVFYCVRRKGGPCCRQ 

RREKGAP 


1020 


2370 


A 


8530 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHi) 

SPLLPHAMKSPFYRCQNTTSVEKGNSAVMGG 

VLFSTGLLGNLLALGLLARSGLGWCSRRPLR 

PLPSVFYMLVCGLTVTDLLGKCLLSPWLAA 

YAQNRSLRVLAPALDNSLCQAFAFFMSFFGL 

SSTLQLLAMALECWLSLGHPFFYRRHITLRLG 

AL V AP WS AFSL AFC ALPFMGFGKJFVQYCPG 

TWCFIQMVHEEGSLSVLGYSVLYSSLMALLV 

LATVLCNLGAMRNLYAMHRRLQRHPRSCTR 

DCAEPRAOGREASPQPLEELDHLLLLALMTV 

LFTMCSLPVTYRAYYGAFKDVKEKNRTSEEA 

EDLRALRFLSVISIVDPWIFIIFRSPVFRIFFHKI 

FIRPLR YRSRCSN STNMESSL 




1021


2371


A




8536 


1 


237 


RRGEIDMATEGDVELELETETSGPERPFEKPK 
KHDSGAADLERVTDYAEEKEIQSSNLETAMS 
VIGDRRSREQKAKQER 


1022 


2372 


A 


8537 


94 


541 


RKERRRRRRRMEAVVFVFSLLDCCALIFLSV 
YFHTLSDLECDYINARSCCSKLNKWVTPELIG 
HT1VTVLLLMSLHWFIFLLNLPVATWNTYRYI 
MVPSGNMGVFDPTE1HNRGQLKSHMKEAMI 
KLGFHLLCFFMYLY5MILALIND 


1023 


2373 


A 


8540 


26 


431 


' RMMKCPQALLAIFWLLLSWVSSEDKWQSPL 
SLVVHEGDTVTLNCSYEVTTsTFRSLLWYKQEK 
KAPTFLFMLTSSGIEKKSGRLSSILDKKELSSIL 
xttt a TnTr.nc a tvt CAVFAOC^l VTPSLYSNS 
TAEALQL 


1024 


2374 


A 


8544 


1731 


743 


GMU.RYSPIAVVNIVGEAGRDLRRRRAVAVT 

AEKMAVLAPL1ALVYSVPRLSRWLAQPYYLL 

SALLSAAFLLVRKLPPLCHGLPTQREDGNPCD 

FDWREVEILMFLSAJVMMKNRRSITVEQH1GN 

IFMFSKVANTILFFRLDIRMGlXYITLCrVFLM














TCKPPLYMGPEYIKYFNDKTIDEELERDKRVT 

W1VEFFANWSNDCQSFAPIYADLSLKYNCTG 

LNFGKVDVGRYTDVSTRYKVSTSPLTKQLPT 

LILFQGGKEAMRRPQIDKKGRAVSWTFSEEN 

VIREFNLNELYQRAKKLSKAGDNIPEEQPVAS 

TPTTVSDGENKKDK 


1025 


2375 


A 


8546 


2194 


1707 


IWHKTMASLKCSTVVCVICLEKPKYRCFA 

CRVPYCSWCFRKHKEQCNPETRPVEKKIRS 

ALPTKTVKPVENKDDDDSIADFLNSDEEEDR 

VSLQNLKM^GESATLRSLIXNPHLRQLNfVNL 

DQGEDKAKLMRAYMQEPLFVEFADCCLGIV 

FpSQNKRS 


1026 


2376 


A 


8547 


1078 


594 


VGMELPAVNLKVILLGHWLLTTWGCIVhbUb 

YAWANFTILALGVWAVAQRDSIDA1SMFLGG 

LLATIFLDIVHISrFYPRVSLTDTGRFGVGMAIL 

SLLLKPLSCCFVYHMYRERGGELLVHTGFLG 

SSQDRSAYQTIDSAEAPADPFAVPEGRSQDAR 

GY 


1027 


2377 


A 


8557 


1 


340 


DFLGP ASPQEEGG SESSTMTELET AMGM1 1 V v 
FSRYSGSEGSTQTLTKGELKVLMEKELPGFLQ 
SGKDKDAVDKLLKDLDANGDAQVDFSEFIVF 
VAAITSACHKYFEKAGLK 


1028 


2378 


A 


8569 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLL 

LLLLGSGQGPQQVGAGQTFEYLKREHSLSKP 

YQGEAPRPCFLRDWELQVHFKIHGQGKKNL 

HGDGLAIWYTKDRMQPGPVFGNMDKFVGLG 

VFVDTYPNEEKQQERVFPYISAMYNNGSLSY 

DHERDGRPTELGGCTAIVRNLHYDTFLVIRY 

VKRHLTIMMDIDGKHEWRDCIEVPGVRLPRG 

YYFGTSSITGDLSDNHDVISLKLFELTVERTPE 

EEKLHRDVFLPSVDNMKLPEMTAPLPPLSGL 

ALFLIVFFSLVFSVFAIV1GIILYNKWQEQSRK 

RFY 


1029 


2379 


A 


8572 


1 


578 


AAAASHRSRARSRPRRVSSGPAPRRAQSSAG 

RVASGLDSAPLCTMARALCRLPRRGLWLLLA 

HHLFMTTACQEANY G ALLRELCLTQFQVDM 

EAVGETLWCDWGRTIRSYRELADCTWHMAE 

KLGCFWPNAEVDRFFLAVHGRYFRSCP1SGR 

AVRDPPGSILYPFIVVPITVTLLVTALVVWQS 

KRTEGIV 


1030 


2380 


A 


8574 


1352 


372 


DSSTVKGGSESRHLCLIPDLKGKARTREASSG 

SRTCGRRTSLCTSAKSSWTYRSGRLSWQSIKG 

THLTITQALRQPLHRAPLLPGQLCWSPRPLEK 

NKAMGRPLLLPLLLLLQPPAFLQPGGSTGSGP 

SYLYGVTQPKHLSASMGGSVEIPFSFYYPWEL 

AIVPNVRISWRRGHFHGQSFYSTRPPSIHKDY 

VNRLFLNWTEGQESGFLR1SNLRKEDQSVYF 

CRVELDTRRSGRQQLQSIKGTKLTITQAVTTT 

TTWRPSSTTT1AGLRVTESK.GHSESWHLSLDT 

AIRVALAVAVLKTVTLGLLCLLLLWWRRRKG 

SRAPSSDF 


1031 


2381 


A 


8580 


905 


340 


' RRTAGIYPCFPKPGRTRHALCSWLLLLl ugL 

a. rnnrPirOPAXAlU/nVV A fl<?TJ R G ARIL 

AFDDFQbDLAMM W^K i A.KJOt\iKoivir 

FHGVTCAGGFAIVYYLIQKFHSRALYYKLAV 

EQLQSHPEAQEALGPPLNIHYLKL1DRENFVDI 

VDAKLKJPVSGSKSEGLLYVHSSRGGPFQRW 

HLDEVFLELKDGQQIPVFKLSGENGDEVKKE 


1032 


2382 


A 


8593 


2558 


961 


" " RRRPRLLPGAEPCEPRVGPRRADMGCSAKAR 
WAAGALGVAGLLCAVLGAVMIVMVPSLIKQ














QVLKNVRIDPSSLSFNMWKEIPIPFYLSVYFKD 

VMNPSEILKGEKPQVRERGPYVYREFRHKSNI 

TFNNNDTVSFLEYRTFQFQPSKSHGSESDYIV 

MPNILVLGAAVMMENKPMTLKJ.IMTLAFTTL 

GERAFMNRTVGEIMWGYKDPLVNLINKYFP 

GMFPFKDKFGLFAELNNSDSGLFTGFTGVQNI 

SRIHLVDKWNGLSKVDFWHSDQCNMINGTS 

GQMWPPFMTPESSLEFYSPEACRSMKLMY1CE 

SGVFEGIPTYRFVAPKTLFANGSIYPPNEGFCP 

CLESG1QNVSTCRFSAPLFLSHPHFLNADPVL 

AEAVTGLHPNQEAHSLFLDIHPVTGIPMNCS-V 

KLQLSLYMKSVAGIGQTGKJEPWLPLLWFA 

ESGAMEGETLHTTYTQLVLMPKVMHYAQYV 

LLALGCVLLLVPVICQIRSQEKCYLFWSSSKK 

GSKDKEAIQAYSESLMTSAPKGSVLQEAKL 


1033


2383 


A 


8595 


595 


767 


AHLPDTLLLPPHSPTVPTPKSFQCSQKACFSRS 
FCLLLSLVSSSLVSLSLCPPLTQA 


1034 


2384 


A 


8597 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQ 

PMMQTIGQKYCMDPAVIAGVLSRKSPGDKIL 

VNMGDRTSMVQDPGSQAPTSWISESQVFQTT 

EVLTTRITELQRRFPTWTPDQYLRGGLCAYSG 

GAGY\^SSQDLSCDFCNDVLARAKYLKRHG 

F 


1035 


2385 


A 


8603 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADL 

SWDPMAFFTGLWGPFTCVSRVLSHHCFSTTG 

SLSAIQKMTRVRVVDNSALGNSPYHRAPRCI 

HVYKXNGVGKVGDQILLAIKGQKKKALIVG 

HCMPGPRMTPRFD SNNWLIEDNGNP VGTRJ 

KTPIPTSLRKREGEYSKVLAIAQNFV 


1036 


2386 


A 


8606 


1 


562 


PTRAHSFDLCCSPCRRRLLGREEAGEEPTSPV 

TQYLQPRSPEECKMFACAKLACTPSLIRAGSR 

VAYRPISASVLSRPEASRTGEGSTVFNGAQNG 

VSQLIQREFQTSAISRDIDTAAKFIGAGAATVG 

VAGSGAGIGTVFGSLnGYARNPSLKQQLFSY 

AJLGFALSEAMGLFCLNfVAFLILFAM 


1037 


2387 



A 


8615 


2 



2364 


SPGPSLPESAESLDGSQEDKPRGSCAEFlhl D 1 

GMVAHINNSRLKAKGVGQHDNAQNFGNQSF 

EELRAACLRKGELFEDPLFPAEPSSLGFKDLG 

PN SKNVQNI S WQRPKDHNNPLFIMDGI SPTDI 

CQGILGDCWLLAAIGSLTTCPKLLYRWPRG 

QSFKKNYAGIFHFQIWQFGQWVNVWDDRL 

PTKNDKLVFVHSTERSEFWSALLEKAYAKLS 

GSYEALSGGSTMEGLEDFTGGVAQSFQLQRP 

PQNLLRLLRKAVERSSLMGCSIEVTSDSELES 

MTDKMLVRGHAYSVTGLQDVHYRGKMETLI 

RVRNPWGRIEWNGAWSDSAREWEEVASDIQ 

MQLLHKTEDGEFWMSYQDFLNNFTLLEICNL 

TPDTLSGDYKSYWHTTFYEGSWRTGSSAGGC 

RNHPGTFWTNPQFKJSLPEGDDPEDDAEGNV 

WCTCLVALMQKNWRHARQQGAQLQTIGFV 

LYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEI 

FTNSREVSSQLRLPPGEYIIIPSTFEPHRDADFL 

LRVrrEKHSESWELDEVNYAEQLQEEKVSED 

DMDQDFLHLFKIVAGEGKErGVYELQRLLNR 

MAIKFKSFKTKGFGLDACRCMINLMDKDGSG 

KLGLLEFKILWKKLKKWMDIFRECDQDHSGT 

LNSYEMRLVIEKAGIXLNNKVMQVLVARYA 

DDDLITOFDSFISCFLRLKTMFTFFLTMDPKNT 

GH1CLSLEQVLGEGWEGICR1APACPSTPPPPS














SDVPGPASCPRLFPPWDLLPVSTVAADDHVGI 
EAL 


1038 


2388 


A 


8621 


3 


1494 


RSRMARAPLGVLLLLGLLGRGVGKNEELRLY 

HHLFNKYDPGSRPVREPEDTVTISLKVTLTNL 

ISLNEKEETLTTSVWIGIDWQDYRLNYSKDDF 

GGlETLRVPSELVWLPEIVLENNIDGQFGVAy 

DANVLVYEGGSVTWLPPAIYRSVCAVEVTYF 

PFDWQNCSLIFRSQTYNAEEVEFTFAVDNDG 

KTINKIDIDTEAYTENGEWAIDFCPGVIRRHH 

GGATDGPGETDVIYSLIIRRKPLFYVINIIVPCV 

LISGLVLLAYFLPAQAGGQKCTVSINVLLAQT 

WLFL1AQKJPETSLSVPLLGRFLIFVMVVATLI 

VMNCVIVLNVSQRTPTTHAMSPRLRHVLLEL 

LPRLLGSPPPPEAPRAASPPRRASSVGLLLRAE 

ELILKKPRSELVFEGQRHRQGTWTAAFCQSL 

GAAAPEVRCCVDAVNFVAESTRDQEATGEE 

VSDWVRMGNALDNICFWAALVLFSVGSSLIF 

LGAYFNRVPDLPYAPCIQP 


1039 


2389 


A 


8636 


1 


900 


PGRERPGGGGARRRPQHLPALLPSERPDCJATL 

QAMENELPVPHTSSSACATSSTSGASSSSGCN 

NSSSGGSGRPTGPQISVYSGIPDRQTVQVIQQ 

ALHRQPSTAAQYLQQMYAAQQQHLMLQTA 

ALQQQHLSSAQLQSLAAVQQASLVSNRQGST 

SGSNVSAQAPAQSSSINLAASPAAAQLLNRA 

QSVNSAAASGIAQQAVLLGNTSSPALTASQA 

QMYLRAQMLIFTPTATVATVQPELGTGSPAR 

PPTPAQVQNLTLRTQQTPAAAASGPTPTQPVL 

PSLALKPTPGGSQPLPTPA 


1040 


2390 


A 


8645 


98 


1388 


" ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSF 
HEHRHQSGRCLSTGMAPNLKGRPRKKKPCPQ 
RRDSFSGVKDSNNNSDGKAVAKVKCEARSA 
LTKPKNNHNCKKVSNEEKPKVAIGEECRADE 
QAFLVALYKYMKERKTPIERIPYLGFKQINLW 
TMFQAAQKLGGYETITARRQWKHIYDELGG 
NPG STS AATCTRRHYERLILP YERFIKGEEDKP 
LPPIKPRKQENS SQENENKTKVSGTKR1KHEIP 
KSKKEKEN APKPQD AAEVS SEQEKEQETLISQ 
KSIPEPLPAADMKKKIEGYQEFSAKPLASRVD 
PEKDNETDQGSNSEKVAEEAGEKGPTPPLPSA 
PLAPEKDSALVPGASKQPLTSPSALVDSKQES 
KLCCFTESPESEPQEASFPRLPHHTGHRWQTR 
MRRRMTNCPPWQITLPTAP 


1041 


2391 


A 


8646 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYP 

GIKARITQRALD YG VQA GMKMIEQMLKEKK 

LPDLSGSESLEFLKVDYVNYNFSN1KISAFSFP 

NTSLAFVPGVGIKALTNHGTAN1STDWGFESP 

LFVLYNSFAEPMEK^ILICNLNEMLCPIIASEVK 

ALNANLSTLEVLTKIDNYTLLDYSLISSPEITE 

NYLDLNLKGVFYPLENLTDPPFSPVPFVLPER 

SNSNaYIGlAEYFFKSASFAHFTAGVFNVTLS 

TEEISNHFVQNSQGLGNVLSRIAEIYILSQPFM 

VRIMATEPPIINLQPGNFTLDIPASuVuNlLTQPK 

kj cTVFT I v ^ MDFV A STS VGL V1LGORL VCSLS 

LNRFRLALPESNRSNIEVLRFENILSSILHFGVL 

PLANAKLQQGFPLPNPHKFLFVNSDIEVLEGF 

LL1 STDLK YETS SKQQPSFHVWEGLNLISRQW 

RGKSAP 


1042 


2392 


A 


8672 


538 


170 


" ARRIARTRESKAAVSQDNVPALQPGKKKXLR 
LGGKJCKJCFKFFRLPKEFKKQLMYSPSNFKKM 



TSLAGNTVQCXNKLKYVTYSAQYPAYGNrn
LDMITSTDHVLEQDFWICFTFYSVKERQI 


1043 


2393 


A 


8688 


359 


17 


GLKTRAPATPTFQREVLGPAXQDMQRRCPRJ 
GLMTSLLKPDCRRWRDYKRWKSGGFTGESC 
HHADTLGDRGGLQGDHSELLQWQKRILRTE 
GEPSPKYISKNIFPICSYITGFL 


1044 


2394 


A 


8718


292 


1490 


GTVKTS VATPIT AGHSCSSGGVLQVKSPA1 QS 

GFKFTSKMEDFNMESDSFEDFWKGEDLSNYS 

YSSTLPPFLLDAAPCEPESLEINKYFWIIYAL 

VFLLSLLGNSLVMLVILYSRVGRSVTDVYLL 

NLALADLLFALTLPIWAASKVNGWIFGTFLC 

KWSLLKEVNFYSGILLLACISVDRYLAIVHA 

TRTLTQKRYL VKF1CLSI WGL SLLL ALPVLLFR 

RTVYSSNVSPACYEDMGNNTANWRMLLRIL 

PQSFGFIVPLLIXaFCYGFTLRTLFKAHMGQK 

HRAMRVIFAVVLIFLLCWLPYNLVLLADTLM 

RTQVIQETCERRNHIDRALDATE1LGILHSCLN 

PLIYAFIGQKFRHGLLKILAIHGLISKDSLPKDS 

RPSFVGSSSGHTSTTL 


1045 


2395 


A 


8724 


254 


3184 


FRANLAITVANRRGAQGGKMHTCCPPV 1 LEQ 

DLHRKMHSWMLQTLAFAVTSLVLSCAETIDY 

YOEJCDNACPCEEKDG1LTVSCENRGIISLSEIS 

PPRFPIYHLLLSGNLLNRLYPNEFVNYTGASIL 

HLGSNVIQDIETGAFHGLRGLRRLHLNNNKL 

ELLRDDTFLGLENLEYLQVDYNY1SVIEPNAF 

GKLHLLQ VLILNDNLL S SLPNNLFRF VPLTHL 

DLRGNRLKLLPYVGLLQHMDKWELQLEEN 

PWNCSCELISLKDWLDSISYSALVGDVVCETP 

FRLHGRDLDEVSKQELCPRRLISDYEMRPQTP 

LSTTGYLHTTPASVNSVATSSSAVYKPPLKPP 

KGTRQPNKPRVRPTSRQPSKDLGYSNYGPSIA 

YQTKSPVPLECPTACSCNLQISDLGLNVNCQE 

RKIESIAELQPKPYNPKKjMYLTENYIAVVRRT 

DLLEATGLDLLHLGNNRISMIQDRAFGDLTN 

LRRLYLNGNRJERLSPELFYGLQSLQYLFLQY 

NLIREIQSGTFDPVPNLQLLFLNNNLLQAMPS 

GVFSGLTLLRLNLRSNHFTSLPVSGVLDQLKS 

LIQIDLHDNPWDCTCDIVGMKLWVEQLKVG 

VLVDEVICKAPKKFAETDMRSIKSELLCPDYS 

DVWSTPTPSSIQVPARTSAVTPAVRLNSTGA 

PASLGAGGGASSVPLSVLILSLLLVFIMSVFVA 

AGLFVLVMKRRKKNQSDHTSTNNSDVSSFN 

MQYSVYGGGGGTGGHPHAHVHHRGPALPK 

VKTPAGHVYEYIPHPLOHMCxsJNri i Kokcvjin 

SVEDYKDLHELKVTYSSNHHLQQQQQPPPPP 

QQPQQQPPPQLQLQPGEEERRESHHLRSPAYS 

VSTIEPREDLLSPVQDADRFYRGILEPDKHCST 

TPAGNSLPEYPKFPCSPAAYTFSPNYDLRRPH 

AVT unr a /TiCD T DCPVl V^PP<5 AVFVFPMRNE 

QYLHr LrAOUoK-LKtr vli jrroAvr v nr 
YLELKAKLNVEPDYLEVLEKQTTFSQF 


1046 


2396 


A 


8736 


28 


452 


SPSAAGGLAWVSLALGSGSRGRDHSGSGVUT 
AMAGALVRXAADYVRSKDFRDYLMSTHFW 
GPVANWGLPIAATNDMKKSPEIISGRMTFALC 
CYSLTFMRFAYKVQPRNWLLFACHATNEVA 
OLIOGGRLIKHEMTKTASA 


1047 
1048 


2397 
2398 


A 
A 


8741 
8747 


673 
3 


924 
5054 


" ALPGTPQQTVTLNTDGKVKSFTSPHSNPNLPP 
AKFFTSLQSLNWSSHLPPSPATESVGKRGNAK 
PPTTKLLHSSPLWNFFAQQL 

PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKK
VAVPNGQPPSAARYMPREVPPRFRCQQDHK

VLLKJRGQPPPPSCMLLGGGAGPPPCTAPGAN 

PNNAQVTGALLQSESGTAPDSTLGGAAASNY 

ANSTWGSGASSNNGTSPNPIH1WDKVIVDGS 

DMEEWPCIASKDTESSSENTTDNNSASNPGSE 

KSTLPGSTTSNKGKGSQCQSASSGNECNLGV 

WKSDPKAKSVQSSNSTTENNNGLGNWRNVS 

GQDRIGPGSGFSNFNPNSNPSAWPALVQEGTS 

RKGALETDNSNSSAQVSTVGQTSREQQSKME 

NAGVNFWSGREQAQIHNTDGPKNGNTNSL 

NLSSPNPMENKGMPFGMGLGNTSRSTDAPSQ 

STGDRKTGS VG S WG AARGPSGTDT V SGQ SN S 

GNNGNNGKEREDSWXGASVQKSTGSKNDS 

WDNNNRSTGGSWNFGPQDSNDNKWGEGNK 

MTSGVSQGEWKQPTGSDELKIGEWSGPNQPN 

SSTGAWDNQKGHPLLENQGNAQAPCWGRSS 

SSTGSEVEGQSTGSNHKAGSSDSHNSGRRSY 

RPTHPDCQAVLQTLLSRTDLDPRVLSNTGWG 

QTQIKQDTVWDIEEVPRPEGKSDKGTEGWES 

AATQTKNSGGWGDAPSQSNQMKSGWGELS 

ASTEWKDPKNTGGWNDYKNNNSSNWGGGR 

PDEKTPSSWNENPSKDQGWGGGRQPNQGWS 

SGKNGWGEEVDQTKNSNWESSASKPVSGWG 

EGGQNEIGTWGNGGNASLASKGGWEDCKRS. 

PAWNETGRQPNSWNKQHQQQQPPQQPPPPQ 

PEASGSWGGPPPPPPGNVRPSNSSWSSGPQPA 

TPKDEEPSGWEEPSPQSISRKMDIDDGTSAWG 

DPNSYNYKNVNLWDKNSQGGPAPREPNLPTP 

MTSKSASDSKSMQDGWGESDGPVTGARHPS 

WEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMG 

LLSQTEDNPSSKMDLSVGSLSDKKFDVDKRA 

MNLGDFNDINtRKDRSGFRPPNSKDMGTTDS 

GPYFEKGGSHGLFGNSTAQSRGLHTPVQPLN 

SSPSLRAQVPPQHSPQVSASMLKQFPNSGLSP 

GLFNVGPQLSPQQIAMLSQLPQIPQFQLACQL 

LLQQQQQQQLLQNQRKISQAVRQQQEQQLA 

RMVS ALQQQQQQQQRQPGMKH SPSHP VGPK. 

PHLDNMVPNALNVGLPDLQTKGPIPGYGSGF 

SSGGMDYGMVGGKEAGTESRFKQWTSMME 

GLP S VATQEANMHKNGAIV APGKTRGG SP Y 

NQFDIIPGDTLGGHTGPAGDSWLPAKSPPTNK 

IGSKSSNASWPPEFQPGVPWKGIQKIDPESDP 

YVTPGSVLGGTATSPIVDTDHQLLRDNTTGS 

NSSLNTSLPSPGAWPYSASDNSFTNVHSTSAJC 

FPDYKSTWSPDP1GHNPTHLSN1<>1WKNH1SS 

RNTTPLPRPPPGLTNPKPSSPWSSTAPRSVRG 

WGTQDSRLASASTWSDGGSVRPSYWLVLHN 

LTPQIDGSTLRTICMQHGPLLTFHLNLTQGTA 

LIRYSTKQEAAKAQTALHMCVLGNTTILAEF 

ATDDEVSRFLAQAQPPTPAATPSAPAAGWQS 

LETGQNQSDPVGPALNLFGGSTGLGQWSSSA 

GGSSGADLAGASLWGPPNYSSSLWGVPTVED 

PHRMGSPAPLLPGDLXGGGSDSI 


1049 


2399 


A 


8748 


200 


1387 


' VPWKRQDEQLSLQVETLYLDSPAVIHLLSPlh 
LPPSSLPPFLQIVDSSSSACTLDSFFPFLAPWDS 
PQDCGFKDHQPLTLQALTV-ELARWTLMLLLS 
TAMYGAHAPLLALCHVDGRVPFRPSSAVLLT 
ELTKLLLCAFSLLVGWQAWPQGPPPWRQAA 
PFALSALLYGANNNLVTYLQRYMDPSTYQVL 














SNLKIGSTAVLYCLCLRHRLSVRQGLALLLL 

MAAGACYAAGGLQVPGNTLPSPPPAAAASP 

MPLHITPLGLLLLILYCLISGLSSVYTELLMKR 

QRLPLALQNLFLYTFGVLLNLGLHAGGGSGP 

GLLEGFSGWAALWLSQALNGLLMSAVMKH 

GSSITRLFWSCSLWNAVLSAVLLRLQLTAA 

FFLATLLIGLAMRLYYGSR 


1050 


2400 


A 


8758 1 


3 


1660 


WVSSMGFEELLEQVGGFGPFQLRNVALLALP 

RVLLPLHFLLPIFLAAVPAHRCALPGAPANFS 

HQDVWLEAHLPREPDGTLSSCLRFAYPQALP 

NTTLGEERQSRGELEDEPATVPCSQGWEYDH 

SEFSSTLATESQWDLVCEQKGLNRAASTFFFA

GVLVGAVAFGYLSDRFGRRRLLLVAYVSTLV 

LGLASAASVSYVMFAITRTLTGSALAGFTIIV 

MPLELEWLDVEHRTVAGVLSSTFWTGGVML 

LALVGYLIRDWRWLLLAVTLPCAPGILSLWW 

VPESARWLLTQGHVKEAHRYLLHCARLNGR 

PVCEDSFSQEAVSKVAAGERWRRPSYLDLF 

RTPRLRHISLCCVWWFGVNFSYYGLSLDVS 

GLGLNVYQTQLLFGAVELPSKLLVYLSVRYA 

GRRLTQAGTLLGTALAFGTRLLVSSDMKSWS 

TVLAVMGKAFSEAAFTTAYLFTSELYPTVLR 

QTGMGLTALVGRLGGSLAPLAALLDGVWLS 

LPKLTYGGIALLAAGTALLLPETRQAQLPETI 

ODVERKSAPTSLQEEEMPMKQVQN 


1051 


2401 


A 


8759 


515 


1625 


EIRTPVAVSSAPSGDSEGDEEETTQDEVSSHTS 

EEDGGVVKVEKELENTEQPVGGNEWEHEV 

TGNLNSDPLLELCQCPLCQLDCGSREQLIAHV 

YQHTAAWSAKSYMCPVCGRALSSPGSLGR 

HLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKD1AFSPPVYP 

AGILLVCNNCAAYRK1XEAQTPSVRKWALRR 

QNEPLEVRLQRLERERTAKKSRRDNETPEERE 

VRRMRDREAKRLQRMQETDEQRARRLQRDR 

EAMRLK^ANETPEKRQARLIREREAKRLKRR 

LEKMDMMLRAQFGQDPSAMAALAAEMNFF 

QLPVSGVELDSQLLGKMAFEEQNSSSLH 


1052 


2402 


A 


8763 


1106 


70 


RHGHGGRDRRGGGRVARPGGLGRYPGRGAA 

ASLVFVPTRRRSGPSGTASVAAMAYHSGYGA 

HGSKHRARAAPDPPPLFDDTSGGYSSQPGGY 

PATGADVAFSVNHLLGDPMANVAMAYGSS1 

ASHGKDMVHKELHRFVSVSKLKYFFAVDTA 

YVAKKLGLLVFPYTHQNWEVQYSRDAPLPP 

RQDLNAPDLYIPTMAFITYVLLAGMALGIQK 

RFSPEVLGLCASTALVWVVMEVLALLLGLYL 

ATVRSDLSTFHLLAYSGYKYVGMILSVLTGL 

LFGSDGYYVALAWTSSALMYFIVRSLRTAAL 

GPDSMGGPVPRQRLQLYLTLGAAAFQPLIIY 

WLTFHLVR 


1053 


2403 


A 


8768 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVF 

YFTSSSVNSSAYTIYMGKDKYENEDLIKHGW 

PEDIWFHVDKLSSAHVYLRLHKGENIEDEPKE 

VLMDCAHLVKANSIQGCKMNNVNVVYTPW 

SNLKKTADN1DVGQIGFHRQKDVKIVTVEKK 

VNEILNRLEKTKVERFPDLAAEKECRDREER 

NEKXAQIQEMKKREKEEMKKKREMDELRSY 

SSLN1KVENMSSNQDGNDSDEFM 


1054 


2404 


A 


8769 


344 


527 


REATTLACRNSCWVFSRCSLGACKPTVCSMF 
SLSRQGSQTLCLRLAEYCMESVDSQRLLLS 
1055


2405 


A 


8770 


430 


1104 
332 


QQESP AAGAARMNCKJEGTDS SCGCRGNDEK 

KM1 -KC V V V QUO A V UN 1 H^L*JVio I f\iwr\jr r 

YVPTVFDHYAVTVTVGGKQHLLGLYDTAGQ 
EDYNQLRPLSYPNTDVFLICFSWNPASYHNV 
QEEWVPELKDCMPHVPYVLIGTQIDLRDDPK 
TLARLLYMKEKPLTYEHGVKLAKA1GAQCYL 
ECSALTQKGLKAVFDEAJLTIFHPKJCKKKRCS 

EGHSCCSII 

NPRIQLSGNSCCAGSCRVWLSEQ 


1056 
1057 


2406 
2407 


A 
A 


8773 
8778 


261 
3 


477 


PAGIRHEQARGADRMGKCRGLRTARKXRShl 
RRJX?KWHDKQYKXAHLGTALKANPFGGAS 
HAKGIVLEKVGVEAKQPNSAIRKCVRVQLIK 
NGKKITAFVPNDGCLNFIEENDEVLVAGFGR 
KGHAVGDIPGVRFKVVKVANVSLLALYKGK 

KERPRS 


1058 


2408 


A 


8808 


171 


88 1 
50 1 


PGLSQEPSGSMETWIVAIGVLA 1 IhLAS* aal 

VLVCRQRYCRPRDLLQRYDSKPIVDLIGAME 

TQSEPSELEU)DVVrTNPHIEAILENEDWIEDA 

SGLMSHC1AILK1CHTLTEKLVAMTMGSGAK 

MKTSASVSDnVVAKRISPRVDDVVKSMYPPL 

DPKLLDARTTALLLSVSHLVLVTRNACHLTG 

GLDW1DQSLSAAEEHLEVLREAALASEPDKG 

LPGPEGFLQEQSA1 


1059 


2409 


A 


8809 


246 


757 
381 


MRLQG AJFVLLPHLGPIL V WLFTRT>HMSCi W C 

EGPRMLSWCPFYKVLLLVQTAIYSWGYASY 

LVWKDLGGGLGWPLALPLGLYAVQLTISWT 

VLVLFFTVHNPGLALLHLLLLYGLWSTALI 

WHPINKXAALLLLPYLAWLTVTSALTYHLWR 

DSLCPVHQPQPTEKSD 

PKLSVYPLQSHHCLSEPFQSLVCCLA 


1060 
1061 


2410 
2411 


A 
A 


8810 
8820 


304 
1673 


848 


SCKTENLLEMWWFQQGLSFLPSALV1WTSAA 

FIFSYITAVTLHHIDPALPYISDTGTVAPEKCLF 

GAMLN1AAVLC1ATTYVRYKQVHALSPEENV1 

IKLNKAGLVLGILSCLGLSIVANFQKTTLFAA 

HVSGAVLTFGMGSLYMFVQTILSYQMQPKIH 

GKQVFWIRLLLVIWCGVSALSMLTCSSVLHS 

GNFGTDLEQKLHWNPEDKGYVLHMTTTAAE 

W SMSFSFFGFFLTYIRDFQKJSLRVEANLHGL 

TLYDTAPCPINNERTRLLSRDI 


1062 


2412 


A 


8824 


1 


763 


GGAPPASVPARESPVSGAQGSSRTRGHKKAA 

GARAPQLCSSWQRRSAPAMSRGLQLLLLSCA 

YSLAPATPEVKVACSEDVDLPCTAPWDPQVP 

YTVSWVKLLEGGEERMETPQEDHLRGQHYH

QKGQNGSFDAPNERPYSLKIRNTTSCNSGTYR 

CTLQDPDGQRNLSGKVILRVTGCPAQRKEET 

FKKYRAEIVLLLALVIFYLTLIIFTCKJARLQSI 

FPDFSKAGMERAFLPVTSPNKHLGLVTPHKT 

ELV 


1063 


2413 


A 


8826 


147 


627 


CETSTSSAGHAPCRHAAQGPPAEPTGLRLCSE 
HQRLHAWPPGPRRPSLWPPKNGKWHSGKRT 
AGGRPQRRPSRRQSQRPSAWSGSPRMHSPGQ 
KCSLMCPHRSQDSLSTAIFQRSPGANTGRALH 
citxa'cv/adci m cvil-TT O^KRKTTHFVL 

TR 


1064 


2414 


A 


8835 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMlUt 

LNKQVSELSQLYKEAQAELEDYRKRKSLEDV 

TAE^THKAEHEKLMQLTNVSRAKAEDALSE 

MKSQYSKVLNELTQLKQLVDAQKENSVSITE 

HLQVITTLRTAAKEMEEKJSNLKEHLASKEVE 
VAKLEKQLLEEKAAMTDAMVPRSSYEKLQS

EVSQVKREKENIQTLLKSKEQEVNELLQKPQ j 
QAQEELAEMKRYSESS SKLEEDKDKKINEMS 
KJEVTKLKEALNSLSQLSYSTSSSKRQSQQLEA 
LQQQVKQLQNQLAECKKQHQEVISVYRMHL 
LYAVQGQMDEDVQKVLKQILTMCKNQSQK 

K 


1065 


2415 


A 


8841 


3


663


AAATAASLSPRGCRLRTPSSDVGPSRAPPPSA 

APLPTGRAQMSPSGRLCLLTIVGLILPTRGQTL 

KDTTSSSSADATIMD1QVPTRAPDAVYTELQP 

TSPTPTWPADETPQPQTQTQQLEGTDGPLVT 

DPETHKSTKAAHFTODTTTLSERPSPSTDVQT 

DPQTLKPSGFHEDDPFFYDEHTLRKRGLLVA 

AVLFJTGIULTSGKCRQLSRLCRNHCR 


1066 


2416 


A 


8853 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERV/hCi 

RRRRRGRWSRKJCMSLKSERRGIHVDQSDLL 

CKKGCGYYGNPAWQGFCSKCWREEYHKAR 

QKQIQEDWELAERLQREEEEAFASSQSSQGA 

QSLTFSKFEEKKTNEKTRKVTTVKKFFSASSR 

VGSKKEIQEAKAPSPSINRQTSIETDRVSKEFIE 

FLKTFHKTGQEIYKQTKLFLEGMHYKRDLSIE 

EQSECAQDFYHNVAERMQTRGKVPPERVEKI 

MDQIEKY1MTRLYKYVFCPETTDDEKKDLAI 

QKRIRALRWVTPQMLCVPVNEDIPEVSDMW 

KAITDHEMDSKRVPRDKLACITKCSKHIFNAI 

KITKNEPASADDFLPTLIYIVLKGNPPRLQSNI 

QYITRFCNPSRLMTGEDGYYFTNLCCAVAFIE 

KLD AQSLNLSQEDFDR YMS GQTS PRKQb Ah a 

W SPD ACLGVKQMYKNLDLL SQLNERQERIM 

NEAKKLEKDLIDWTDGIAREVQDIVEKYPLEI 

KPPNQPLAAIDSENVENDiaPPPLQPQVYAG 


1067 
1068 


2417 
2418 


A 
A 


8855 
8856 


1372 
1530 


1513 
1583 


SNMREVGCGWLVPVIPAFWEAEVGGSLbARS 

LRQAWATKQDPISKKK 

PCRPGMECNSMISVHCNL 


1069 
1070 


2419 
2420 


A 
A 


8857 
8866 


1530 
293 


1583 
1675 


PCRPGMECNSM1SVHCNL 

PYPQGGYPQGPYPQEGYPQGPYPQGGYPQUF 

YPQSPFPPNPYGQPQVFPGQDPDSPQHGNYQ 

EEGPPSYYDNQDFPATNWDDKSIRQAFIRKVF 

LVLTLQLSVTLSTVSVFTFVAEVKGFVRENV 

WTYYVSYAVFFISLIVLSCCGDFRRKHPWNL 

VALSVLTASLSYMVGMIASFYNTEAVIMAVG 

ITTAVCFTWIFSMQTRYDFTSCMGVLLVSM 

WLFIFAILCIFIRNRILEIVYASLGALLFTCFLA 

VDTQLLLGNKQLSLSPEEYVFAALML Y 1 DUNl 

FLYTLTriGRAKE^PSSSSLCPLRWHGWPGPCP 

WHGSASCTSPLSCPQAQPREKDASLQPSCMY 

TADTSIWTRCGHSN4APLVLPPPPRGTKATFPC 

HLLSTHCCMSPVCQrl PG 1 bub 1 K£>kueaj lov 
EVRVHVFPPVPAPQPGVEHPSPPPHPPGVLPS 

GDMRSGGLIPVLSPE 


1071 


2421 


A 


8868 


2 


358 


ARGNTLYHLPRLCRKLNLRWFSASTLYDVQH
DDKMGSNTFFKRNDCRYVM1SCKADMAYDN 
VRHPFM1 * SIXKXIMEETYLNIIKAVYDRPTASII 
LNGEKLKVFPVRSGT*QGCSVWP 


1072 


2422 


A 


8870 


33 


658 


MESVLSKYEDQITIFTDYLEEYPDTDELVW1L 
GKQHLLKTEICSKLLSDISARLWFTYRRKFSPI 
GGTGPSSDAGWGCMLRCGQMMLAQAL1CRH 
LGRDWSWEKQKEQPKEYORILQCFLDRKDC 
CYSIHQMAQMGVGEGKSIGEWVLGPNTVXAQ
G V* KNLA\LFDEW\NSLGLVYVSM\DNPSGS1A 
RFPKKLCRVLPLVSADTAGLTGP 


1073 


2423 


A 


8879 


146 


412 


DPS V* GDVDIEVTCPICLQLLTEPLSLNCGLRL 
*QVCITA*IKESV11SGG*SSSPVCHTTFQPANL 
RTSRYLPT* SIKSLGPDEPQEG 


1074 


2424 


A 


8884 


67 


435 


HLQGRSIRTLQLTGENEKNCEVSERIRRSGPW 
KEISFGDYICHTFQGDCWADRSPLHEAAAHG 
RLLALKTLIAQGVNVNLWTL/DRVSSLHEACL 
* GP V AC AKPY WKMVPRHGGTVTGPPLLMV 


1075 


2425 


A 


8896 


1294 


248 


RSGDRNGLTHQLGGLSQGSRNQSYRSRSRSR

SRERPSAPRGIPFASASSSVYYGSYSRPYGSDK 

PWPSLLDKEREESLRQKRLSERERIGELGAPE 

V^VGLSPKNPEPDSDEHTPVEDEEPKKSTTSAS 

TSEEEKiCKXSSRSKERSKKRRKKKSSKRKHK 

KYSEDSDSDSDSETDSSDEDNKRRAKKAKKK 

EKKXKHRSKKYKKXRSKKSRKESSDSSSKES 

QEEFLENPWKDRTKAEEPSDLIGPEAPKTLTS 

QDDKPLNYGHALLPGEGAAMAEYVKAGKRI 

PRRGEIGLTR*RNCHHLNAQ VM* * WSRHRR 

MEAVRTAKREPESTVLMRREPLHPFNPRRET 

KERE 


1076 


2426 


A 


8899 


146 


789 


GRSTEAEKEPAFDERTGKGRRLPRAGEEHCj* b 

* APGPGPRSFQ VSRKMPEEVPPGARKHPFSGKS 

FYLDLPAGKNLQFLTGAIQQLGGVIEGFLSKE 

VSYIVSSRREVKAESSGKSHRGCPSPSPSEVR 

VETSAMVDPKGSHPRPSRKPVDSVPLSRGKE 

LLQKAIRNQK**CTVQQLSHCRLY\GEKTTAK 

RSOREHVQQQSQEHGKWPDLKGPR 


1077 


2427 


A 


8901 


352 


3 


AKIGAYKY1QELWRKKQSDVMHFLLRVRCW 
QYPALHRAGTEWQLSALHRAPRSTQPDKAC 
RLGYKAKQGYIIYRICVRRGGWKCPVPKAVT 
\YGKPVHHGVN*LKFAQSLQSVAEEQ 


1078 


2428 


A 


8905 


536 


781 


ACPAENREVPEMAAGQAPHAGPGAGPGQPA
PALPFAATPGSRGQALCRGGRRRQHLHGPLH
RP*QAAPALHAGCQLAPHPPT


1079 


2429 


A 


8912 


121 


376


' NLIWKLCVTERRLVILDNYDLASEA^EANKYI 
CNRHQFKPGQDKYFTLGLPTGSTPL*CYPKLI 
EYNKNGHLSFKYVKTFSMDEY 


1080 


2430 


A 


8920 


381 


1788 


SSESPSDPGRMAMTW1VFSLWPLTVFMGHIG 

GHSLFSCEPITLRMCQDLPYNTTFMPNLLNHY 

DQQTAALAMEPIWMVNLDCSRDFRPFLCAJL 

YAPICMEYGRVTLPCRRLCQRAYSECSKLME 

MFGVPWPEDMECSRFPDCDEPYPRLVDLNLA 

GEPTEGAPVAVQRDYGFWCPRELK1DPDLGY 

SFLHVRDCSPPCPNMYFRREELSFARYFIGLIS 

CCLSATLFTFVTFLIDVTRFRYPERPIKCYAV 

WHMMVSLIFF\IGFLLEDRVACNA\SIPAQYKA 

STVTQGSHNKACTNfLFNflL YFFTMAG S V W W 

VILTITWFLAAVPKWGSEAIEKKA1XFHASA 

WG1PGTLTIILLAMNK1EGDNISGVCFVGLYD 

VDALRYFVLAPLCLYVWGVSLLLAGIISLNR 

\m rcTDi tWKinTWn VKFMTRIGVFSILYLVPLL 

WIGCYFYEQAYRGIWETTWIQERC 


1081 
1082 


2431 
2432 


A 
A 


8922 
8923 


56 
355 


420 
1079 


EFRTKMSTGPDVKATVGDISSDGNLNVAQEE 
CSRKGIVDEFFPLLSN*CrWTQPQGYPQSSYG 
TLANFVRCSVRHGLALILQLCNFSIYTQQMN 
LS1AIPAMVNNTAPPSQPNASTERPST 
" PFGTTSSTMAVVKNKCLMKGGKKG VKKK V V 
GPFSKKDQYDVKAPAMFNIRNTGKyTLVART

QGTQIASDGLKGLLFEVSLADLQNDEVAFRK 

FKLITEDVQDKNCLTOFYGMDLTCDKJCSMV 

uv;\i/<sTMTF AHVDVKTTDGYFFHLFCVGFTKK 

HNNQILKTSYA*HQQS/RQIQKKMMEIMT*EV 

QTNDLKEVVNKLIPDN1GKDTEKV/CPIYPLH 

DVFIRKVKMLENPGFERVMELRGGGSSS 


1083 


2433 


A 


8948 


28 


385 


LTWPQPHIPSCPAMSEETLQSKI^AAAKKKLP 
WGAVQGSRAMSDLLLLLLDLTLLLLXMLLGF 
AGYSGQLAGVAVSAGSPPI/RYKFHVEPYGET 
r.un T T/F^rSTSPKI CSLAVH*DNPAWF 


1084 


2434 


A 


8950 


156 


318 


HYTPINTDTIEN SENNKC W* G Y *E\VGLIHHW 
wnovRvnPFWK R VWOKRTLNLRV 


1085 


2435 


A 


8956 


16 


413 


HMGQLG YFIQC W WECKRLISFNWKTI* QSPAK 
* TIYTS YDT AIPI S/GI/YPKRMSSKCHQETC AR 
a PFTATUCnKOI TCPLVEERIDYXMWYS 
HKYYTKVKRNL*VTITH\TWVNLNILMFEIILW 
YSHKYY 


1086 


2436 


A 


8962 


868 


1026 


H*KILQVGRAQRAHXSRL*SQLLRRLRHESHL 
NPGARGCSEARLHRCTPAWTT 


1087 


2437 


A 


8985 


58 


330 


i mrvui nviPOT VF^FVlCHCTl MPVS*ELORL 

♦ERSVCAFHVCIQTYVCLQVYACMCVYYICM 
FVYSVYGCGLCTCVCMDVYICVCVQEFL 


1088 


2438 


A 


8989 


394 


404 


KJ^LDMMSNA*STKKHDKLD/L1KFKT/LCSA 
KYTVKJUKJHPTDLEKMLRNHLSDKD*YS/GV 
YKDLSKLNRJIKTE/S*/VKKWVKDLSRYFIKE 


1089 


2439 


A 


8991 


60 


329 


MALTPESPSSFPGLAATGSSVPEPPGGPNATL 
NSSWDSPTEPSSLEDLEATGT1GTLLSDMGW 
GVEDNAYTLEVNSRYMRAVGIM*IHL 


1090 


2440 


A 


8996 


2 


351 


qxtttttt T*MK^ YTWTFCW*GCGOIG/T/LIYC 
WQESKFIQAFWSKIQQYLAnSIHILFDPAFLFL 
GGYPGGTQSVFLTGVLVSSVFYNMKMLHTR 
t 1 TAA1 FITVOYWKOSKDHY1 


1091 


2441 


A 


8997 


97 


456 


YPLPVCSYLSGPRGEHWNSLGGKSSCPLPLK1 
LVSSRFKISKVIWGDLSVGKTCLINR*GGAG 
AELGRVGPSLARWAGSRSQHLVPSQWCKDS 
FDKNYKAPIGADFEMERFEVLGIPF 


1092 


2442 


A 


8999 


548 


811 


S SF1KRHILIFEDD WHQTTCCHOTHHP\F* RCJQ 
FHIFYVSVQNSISPSLSVSSSHPDRPDHEVHQH 
RAAHHHQHGQGPLGHGLVARVG 


1093 


2443 


A 


9002 


3 


2745 


ALLGLQQPAQSULSRSSVMGVRGLQGFVGS 

TCPHICTVVNFKELAEHHRSKYPGC 1 P 1 1 WD 

AMCCLRYWYTPES WICGGQWRE YFS ALRDF 

VKTFTAAGIKX1FFFDGMVEQDKKDEVAAKRR 

LKNNREISRJFHYIKSHKEQPGRNMFFIPSGLA 

VFTRFALKTLGQETLCSLQEADYEVASYGLQ 

HNCLG1LGEDTDYLIYDTCPYFSISELCLESLD 

TVMLCREKLCESLGLCVADLPLLACLLGNDU 

PEGN1FESFRYKCLSSYTSVKENFDKKGNIILA 

VSDH1SKVLYLYQGEKKLEEELPLATXQSSFL 

•RNGIISFTRT/INLHGFSKNPKV**LWTNK*yP 

RVQTPNPGKKFPCVQMLNPGKKFPCVQALNP 

GEKFPCIHI/PEPRQEVPTCSDPEPRQEVPTCTG 

PESRREVPMCSDPEPRQEVPMCTGPEPRQEVP 

MCTGPEARQEVPMCTDSEPRQEVPMCTDSEP 

RQEVPMYTGSEPRQEVPMYTGPESRQEVPMY 

TGPESRQEVLIRTDPESRQEIMCTGHESKQEV 
PICTDPISKQEDSMCTHAEINQKLPVATDFEKK.

LEALMCTNPEIKQEDPTNVGPEVKQQVTMVS 

DTEILKVARTHHVQAESYLVYNIMSSGEIECS 

NTLEDELDQALPSQAFIYRPIRQRVYSLLLED 

CQDVTSTCl^VKEWFVYPGNPLRHPDLVRPL 

QMTIPGGTPSLKJLWLNQEPEIQVRRLDTLLA 

CFNLSS SREELQAVESPFQ ALCCLLIYLFVQV 

DTLCLEDLHAFIAQALCLQGKSTSQLVNLQP 

D YINPRA VQLG SLL VRGLTTL VL VN S ACGFP 

WKTSDFMPWNVFDGKLFHQKYLQSEKGYA 

VEVL/CRTK*ISAHQIPQPEGSRLQGLHEGEQT 

HHWPSPLGLTPRREVGKTGLQLPQDGLWV 






A 


9021 


97 


834 


AREACRAKTDFPGRRFRLWPSCCCRVIVGAE 
T* H\MA£PVSPLKHFVLAKKAITAIFDQLLEFV 
TEG SHFVEATYKNPELDRIATEDDL VEMQG Y 
KDKLSUGEVLSRRHMKVAFFGRTSSGKSSVI 
NAMLWDKLVLPSGIGHITNCFLSVEGTDGDKA 
YLNfTEGSDEKKSVKTVNQLAHALHMDKDLK 
AGCLVRVFWPKAKCALLRDDLVLVDGPGTD 
VTTELDSWIDKFCTKSSTREITNSGSDT 


1095


2445 


A 


9022 


1 


537 


LVLNSRVEDFVPPEGAGRT.LPFALRPLAACW 
LLHRRARRSSALCPRPRSWGVSGGEGAGARE 
P*ITSSSCCLSAA/SHLSIQSPNMAGARRRIRPQ 
LAKEKIEGCHICTSVTPGEPQVFLGKDKAFTF 
DYVFDIDSQQEQIYIQCIEKLIEGCFEGYNATV 
FAYGQT\GAGKTYTMGTGFD 


1096 


2446 


A 


9029 


1 


285 


FFFFN^CKSPKVPKPGCKEESTGTLFKNTLISL 
GQHSETPSLKKKXLAGYSGMCL* SQ VLRRLRQ 
EDCLSPGGGNCRES* SCPYTPAWITERDPV 


1097 


2447 


A 


9032 


716 


357 


ARSTGFWGEILWCGFLKRSLALSPRVKCSGAI 
LAHCNFRHAGFPPLSCLSLPNRWEYRRPPARP 
GKFFLVFLVETGFQC/G*DGLDLLTSRSACLG 
LPKCWDYRREPAASIIFQTTFFINSK 


1098


2448 


A 


9038 


230 


652 


KVWMSCEDINISGSFYRNKLKYLAFLCKRTS 
TNPSQGPYHLWVPSH1FWQTTCGRLPHKTKQ 
G* AALDHLKVFDRIPLPYDKKKQMAVS ATLE 
WRPKP* RKFAYLGHWAQKVD WKYQAMTA 
TMGEKRKVYYQKICYQKK 


1099 


2449 


A 


9043 


185 


372 


IIFYSHQQCMRV/WQGCGDIETLIHCW*E*K1I 
HSIVWK/TV* QFLKRL YLHLPHNS VIAFLGISP 
RKIKTCTQNSCTSMLINAIHNDQKWKKINI 


1100 


2450 


A 


9045 


763 


584 


RQSLALSPRLECSGTISAHCRLCPLVFTPLSCL 
SLTSSWDYRRPPPHPANFLYFK*RRGF 


1101 


2451 


A 


9050 


275 


2 


LFFLRKVSNQFLSPSLLPVNFQGFVFAFLLLLL 
FLL/FEMESLPVA/RVECSGTISAHCNLCLPGSS 
DSPASAS*VAGITDMCRYTQLILFHAS 


1102 


2452 


A 


9053 


449 


1224 


KTSMFWKTOLHSSSHIDTLLEREDVTLKELM 

DEEDVLQECKAQNRKLEEFLLKAECLEDLVSF 

I\*EEPPQDMDEKJRYKYPN1SCELLTSDVSQM 

NDRLGEDESLLMKLYSFLLNDSPLNPLLASFF 

SKVLSILISRKPEQIVDFLKKKHDFVDLIIKHIG 

TSA1MDLLLRLLTCIEPPQPRQDVLN/WFKVQ 

T>xn *TJCT#XT\r\yTnTQV W"M7 WVL/fiT TsHCSHSLL* 
K JN L no 1 IN V IVlJUloPv I v L,n w ubniwi^^^ 

LLLQCVLQWLNEEKflQRLVEIVHPSQEEDVS 
SLV 


1103 


2453 


A 


9058 


403 


3 


" GLHVYDFQVYREHILTLNVKKCS VSFWGLRE 
WLYLQryTYEnKSPRFPIIKMTDITKCW* GOGA 
AGMQLWCW\WCVNVGKFWEMS*YYLLKLSI 
ST/P>T)PAIPLLGIYL»ETRVYIHPKTCMRML1A 
APFVLAVNC


1104 


2454 


A 


9064 


75 


393 


KWLFSSLNITGRGDnGHLKWLDCR\NCSSFPI 
KRNRQTHSTESNKLKAGHSFGYN*LIH*NS\V 
KTDCGCGANSKGWWMKWCTAQQKQTTS 
YMQIGTTKNSRAT 


1105 


2455 


A 


9065 


366 


778 


DLLILRNLAFPELKRRNCISRFYLAYHLHKIYS 
RSELLCNNCSGFYILSL*QYDVFFFNYFFFRDR 
AWPCCPGWSAAWLTIYILAHYRRPGLERSCC 
LSLSSSWDHRRVPPCPANF*/YFSMGFTAFPRL 
VLNS'TQGl 


1106 


2456 


A 

A 




o / J 


816 


ESGSLIH* WWENKPAQPLWWEI*QHVQKLPT 
HFPCDPAIPLLGICPED 


1107 


2457 


A 


9086 


580 


18 


KPSSGSFIRAIYIFLSTAHVPALFSVLVRTKLT* 

AFSQSSVLWAHKQQKTSLSLVIR/ERLQIKTA 

VRENFLPDU^AKILKLDNVKCWQG/SGSNMSL 

1/HCWWEYNVmnWNSVTFPRKVEHVYITYA 

PEIS VR* IHGGLPTL VHQETHTS VFRG APS VIP 

ETR\CRFTKESINKLLHIYTMEHYGDENK 


1108 


2458 


A 






1 


GGNDCSVTPTTEPGRKEIT*KRKJF* EKTDRLP 
GA/PPSRTPPTPYPCPHGDRLLPPSRPLPAGPA 
S AFPPAERSRGHRRASL* RARWS AA VPRRS A 
GSASEPVQSRWLRLPVGSDSPPAVPVRVCPAP 
DSRPAAPGSRLPDPGLDSPAPSRTPSSSVD* GO 
QRPPPPSGDSLSPPGCCRY 


1109 


2459 


A 


9099 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWL 
GAVAHSCNPSTLVGRGGRITRGQELR 


1110 


2460 


A 


9103 


242 


70 


EEQFFFFAVGMFP*VDFLAPASGELWDRLRLT 
CSRPFTRHQSFGLAFLRVCSSLDSLDDSWGP 
SALLSSVL/NQGGRNVLEAREAAKHPTI*RQS 
LLRKQRNKRMAIP 


1111 


2461 


A 


9110 


189 


121 


SFLSVRLECNGAIMAHCALPLPG 


1112 


2462 


A 


9113 


100 


910 


RRRGGGSRPRRTPVPAPGPGPSFGMDVRFYP

AAAGDPASIJ)FAQCLGYYGYSKFGNNNNYM 

NMAEANNAFFAASEQTFHTPSLGDEEFEIPPIT 

PPPESDPALGMPDVLLPFQALSDPLPSQGSEFT 

PQFPPQSLDLPSITISRNLVEQDGVLHSSGLHM 

DQSHTQ VSQYRQDPSLIMRXPSST* PDAARSG 

VMPPAQLTTINQSQLSAQLGLNLGGASMPHT 

SPSPPASKSATPSPSSSINEEDADEANRAIGEK

RAAPDSGKKPKTPKK 


1113




A 

r\ 


qi ?n 


3452 


3051 


FLRPSFALVPQAGVQWCALSWLQPPSPRFK*F 
SCLSLPSSWDYRHVPPRPANFFVLLVETGFLH 
VGQAGHEPLTSGDPPASASQSAGITGVSHQA 
WPSFFIFSRDTVLLCCS G WSRTSGLKQSACLS 
LLKCWDY 






A 


9122 


152 


377 


NQLPLQQWTFFIYETGFCSVAQAGVQCRDHS 

SLHP*PPG\SSDPPAPPS*VLGITGQRYHACLn 

YLYVQTVPQRV 


1115 


2465 


A 


9124 


553 


981 


QRPLLRQQLGSWPTCRSLEGDLASPW**RLPG 

SPRMRRSGT/ATLNLPLSPQGTVRTAVEFQVM 

TQTQSLSFLLGSSASLDCGFSMAPGLDLISVE 

WRLQHKGRGRGDLHLPDHHLSVPSSADHPA 

QQPSQFNGRNLYFLPLFR 


1116 


2466 


A 


9135 


48


410 


SASHEPAEHDGGADSLSASQPPRPAGRPAGA 
QHVHVPPWTDVLAGQDRRAPTAGDGAPWP 
APGGHVPSTRPHDPAEFHADEAAGRGGRGLQ
PAAPHALPAGLPHGPPAPA/PAEGGGTP*GSA
GAGGP*GSPAGRACGAAGCRPRPPRPAASSA
*NSAGS*GLVEGT*PPGAGHGAPSPAVGARLS






WO 01/57188 



PCT/US01/03800



SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)














CPARTSVQGGTWTC*APAGRPAGLGGWEAE 
RESAPPSCSAGS*DAD*OAEPWQAGSRSWGS 


1117 


2467 


A 


9141 


380 


939 


KSGHWAKECLQPRIPPRPCPICVGPHWKSDCP 
TCPGAVPRAPGTLPQGSLTDSFPDLLSLVAED 
* CCLMASEAS WTIT\ELWVTLTVEGKS VP/CL 
KTFATHST1 PSFOGPVSLASITWGEDGOASKP 
LKTPQLWCQLGQYSFMHYFLVIPTCPVPLLG* 
GILTKLSAFLTIPRLQPHLIAALSPSS 


1118 


2468 


A 


9154 


471 


2 


AAGQWVEVTSHLYLCITSDAAGLRLLPPAES 
ERGEGGHCPAEAPLPPRPQYCLAKHPLLRKLP 
EEKIKL DP YLTQHTKINSKQK YLS/VRAKTTQ 
LVEGNIG VNLQNTELKQH * 3NGFLDTTPEAQE 
TKJEKTNKLNFIKKVKRQLAEWEKIFQIA 


1119 


2469 


A 


9155 


2 


3187 

207 


ACPRLARRRRRWSLRRRRGWLRARWSRGQ 

NNMAARRJTQETFDAVLQEKAKRYHMDASG 

EAVSETLQFKAQDLLRAVPRSRAEMYDDVHS 

DGRYSLSGSVAHSRDAGRESLRSDVFSGPSFR 

SSNPSISDDSYFRKECGRDLEFSHSNSRDQVIG 

HRKLGHFRSQDWKFALRGSWEQDFGHPVSQ 

ESS WSQE YSFGP S A VLGDFG S SRLDEKECLEK 

ESRDYDVDHPGEADSV/LRGGSQVQARGRAL 

NIVDQEGSLLGKGETQGLXTAKGGVGKLVTL 

RNVSTKKIPTVNRITPKTQGTNQIQKNTPSPD 

VTLGTNPGTEDIQFPIQKIPLGLDLKNLRLPRR 

KMSFDIIDKSDVFSRFGIEIIKWAGFHTIKDDIK 

FSQLFQTLFELETETCAKMLASFKCSLKPEHR 

DFCFFTIKFLKHSALKTPRVDNEFLNMLLDKG 

AVKTKNCFFEIIKPFDKYIMRLQDRLLKSVTP 

LLMACNAYELSVKMKTLSNPLDLALALETTN 

SLCRKSLALLGQTF SL AS SFRQEKIL* AVGLQ 

DIAPSPAAFPNFEDSTLFGREYIDHLKAWLVS 

SGCPLQVKXAEPEPMREEEKMIPPTKPEIQAK 

APSSLSDAVPQRADHRVVGTIDQLVKRVTEGS 

LSPKERTLLKEDPAYWFLSDENSLEYKYYKL 

KLAEMQRMSENLRGADQKPTSADCAVRAML 

YSRAVRNLKKKLLP\WQRRGLLRAQG\LRG\ 

WKARRA\TTGTQTLLFLRAPGLKHHGRQAPG 

LSQAKPSLPDRNDAAKDCPPDPVGPSPQDPSL 

E A S GPSPKPA GVDISEAPQTS SPCPS ADIDMKT 

METAEKLARFVAQVGPEIEQFSIENSTDNPDL 

WFLHDQNSSAFKFYRKKVFELCPSICFTSSPH 

NLHTGGGDTTG SQESP VDLMEGEAEFEDEPP 

PREAELESPEVMPEEEDEDDEDGGEEAPAPG 

GAGKSEGSTPADGLPGEAAEDDLAGAPALSQ 

ASSGTCFPRKRISSKSLKVGMIPAPKRVCLIQE 

PKGECPPVGTVASSTVLGWWAVRVRRDRWR 

HFNPKEFC APLQNVSRH SCFP W 

PPRACRPCPRACPCPPT+KCSQPVSWPC 


1120 
1121 


2470 
2471 


A 
A 


9163 
9166 


124 
272 


523 


PMSSLQGCFYTFKCIIFKGIFLLLISNLIAF* *EK 

V/CSHITDSLKFIGKGWVGMVTHACNPGTLG 

G*GGWIA*VREFETSLGNM 




2472 


c 


9170 


442 


236 


" MNRRRFLRPADCHSGMRGTENGACSEGESQI 
HCGAGGEGVQLVHWNQPENGCLQFDSTHIT 
FSKRON* 


1123 


2473 


A 


9171 


10 


423 


MVDRSPLLTSVlIFYLAlGAAIFEVLEEPHWKJi 
AKJC>mTQKLHLLKEFPCLGQEGLDKILEVV 
SDAAGQGVAITGNQTFKNWNWPNAMIFAAT 
VITTIGYGNVASKTPGGRLFCGFYGLFGVPFC 
LTWINALGKFFG 


1124 


2474 


A 


9173 


3 


374 


GPSPSLLVLLPQEPGGTGTPVRAGAGACiMWL 
WEDQGGLLGPFSFLMLMLLLETRNPVNACLL 
TGSLFVLLGVFSFEPVPSCRALQELKPRDRISA 
lAHRGGRHDPPENTLGAIKA^i'* waiNKK 


1125 


2475 


A 


9179 


704 


188 

233 


ESSSGLLFQCFQGIHVQKLTLQARPTLFSWWL 

CSKPPKETGELENAESGGDGGRRGGKQDNV 

AWWRRM\QKG\DFPWDDEDFPQSGPFGGQA 

LPMGFFYLYFRDPGREITWKHFVQYYLARGL 

VDRLEWNKQSVRVIPAPGTSSEVRGEFKAE 

YCRHKFISCKNWFYFFQ 

MEYMAESTDRSPGHILCCECGVPISPN 


1126 
1127 


2476 
2477 


A 
A 


9183 
9185 


153 
1 


321 


LTGQLGSILLRVFSKSRAGLGARKLKAYRTM
EYMAESTDRSPGHILCCECGVPISFNPAQY\CV 
ACLRSSFHIYHCIPKLFIHPFSKTSSSAFITPSHY 
LTFFSTIS 


1128 


2478 


A 


9186 


183 


84 / 


"VI KF1 1 LOTMDEOSOGMOGPPVPQFQPQKAL 
RPDMG YNTL ANFRIEKKJ GRGQ\FSE VYRAAC 
LVLDGVPVALKKVQIFDLMDAKARADCIKEID 
LLKQLNHPNVIKYYASFIEDNELNIVLELADA 
GDLSRM1KHFKKQKRLIPERTVWKYFVQLCS 
ALEHMHSRRVMHRDIKPANVFrTATGWKXG 
DLGLGRFFSSKTTAAHSLVGTPYYMSPERIHD 

NG 


1129 


2479 


A 


9190 


1 


lift 


GTSWKJPSAAVSESSPNGAAYASGLPCGVRG 
PPWAGLALLPSPTLMALLRRPTVSSDLDN1DT 
RATTuXmVVATITRARIEDMRHSATALTRPD 
ATTAQIPKLPVTTVCNRRANPGIPPSVL 


1130 


2480 


A 


9194 


131 


487 


AYLKRLPVPESITGFARLTVSEWLRLLPFLGV 
LALLGYLAVRPFLPKKKQQKDSLINLKIQKEN 
PKVVNEINIEDLCLTKAAYCRCWRSKTFPAC 
DGSHNKHNELTGDNVGPLILKKKE 


1131 


2481 


A 


9201 


184 


605 


" KELVDEKSERGRAMDPVSQLASAGTFRVLKJi 
PLAFLRALELLFAIFAFATCGGYSGGLRLSVD 
CVNKTESNLSIDIAFAYPFRLHQVTFEG\PTCE 
GKERHKLALIGDSSSSAEFFGTVAGFAFLYSL
AATGVYIFFQNKY 


1132 


2482 


A 


9206 


1 


852 


GG GRAGAGSRDMGSTD SKLNFRKA V1QLT1 K 

TQPVEATDDAFWDQFWADTATSVQDVFALV 

PAAEIRAVREESPSNLATLCYKAVEKLVQGA 

ESGCHSEKEKQIVLNCSRLLTRVLPYIFEDPD

WRGFFWSTVPGAGRGGQGEEDDEHARPLAE 

SLLLAIADLLFCPDFTVQSHRRSTVDSAEDVH 

SLDSCEYIWEAGVGFAHSPQPNYIHDMNRME 

LLKLLLTCFSEAMYLPPAPESWQH/RTHWFSS 

FVSSENRHALPLFTSLLNTVCAYDPVEYGIPY 

NHLY 


1133 


2483 


A 


9208 


1165 


1463 


GPRARVQGFSGADrVKPMALGSMYLVLTUV 
AKVLRGAEPCCGPLKNRVLRPCPLP/VPLPPP 
HPQPSRGNPVGCLPTYKWYKLLSWPLHSNS 
NVYFIV 


1134 


2484 


A 


9210 


66 


1586 


MAGAGPKRRAl^APVAEEKEEAREKJMAAK 

RADGAAPAGEGEGVTLQGMTLLKGVAVIW 

AIMGSGIFVTPTGVLKEAGSPGLALVV WAAL 

GVFSIVGALCYAELGTTISKSGGDYAYMLDV 

YGSLPAFLKLWIELLIIRPSSQYIVALVFATYL 

LKPLFPTCPVPEEAAKLVACLCVLLLTAVNC 

YSVKAATRVQDAFAAAKLLALALIILLGFVQ1 

GKGDVSNLDFNFSFEGTKLDVGNIVLALYSG 

LFAYGGWNYLNFVTEEMINFYRNLPLAHISLP 














IVTLVYVLTNLAYFTTLSTEQMLSSEAVAVDF

GNYHLGYMSWIIPVFVGESCFGSVNGSLFTSS 

RLFFVGSREGHLPS1LSMIHPQLLTPVPSLVFT 

CVMTLFYAFSKDIFSVrNFFSFFNWLCVALAn 

G^WLRHRKPEIXRPIKVNLALPVFFILACLF 

LLAVSFWKTTPWSVASDFTIILSGLPVYFFGV 

W\VT<>na ) KWAPPGHLSPRPSCVRSSCMVVPQ 


1135 


2485 


A 


9216 


40 


410 


RDRLPPAYFCRPWCWTALDVGNSPESQEM 
DLVAFEDVAVNFTQEEWSLLDPSQKNLYREV 
MQETLRNLASIGEKWKDQNIEDQYKNPRNNL 
RSLLGERVDENTEENHCGETSSQIPDDTLNK 


1136 


2486 


A 


9223 


3 


983 


RRRRRSRYRRCSRFPRPGPLAVSMPHAFKPU 

DLVFAKMKGYPHWPARIDDIADGAVKPPPN 

KYPIFFFGTHETAFLGPKDLFPYDKCKDKYGK 

PMKRKGFNEGLWEIONNPHASYSAPPPVSSSD 

SEAPEANPADGSDADEDDEGNRGVMAVTAVT 

ATAASDRMESDSDSDKSSDNSGLKRKTPALK 

MSVSKRARKASSDLDQASVSPSEEENSESSSE 

SEKTSDQDFTPEKKAAVRAPRRGPLGGRKKK 

APSASDSDSKADSDGAKPEPVAMARSASSSSS 

SSSSSDSDVSVICKPPRGRXPAEKPLPKPRGRK 

PKPERPPSSSSSD 


1137 


2487 


A 


9229 


21 


239 


" LFPRLECRDPVTVNCTLNLPGSKNAPTTASQV 
GST\WYRGGLPHPTNFFVKTGFRCSQAGLKL 
RGSREPPAWA 


1138 


2488 


A 


9231 


1664 


2 


TRSVGVNTCEVGWTEPECLGPCEPGTSVNL 

EGIVWHETEEGVLVVKVrnVRNKTYVGTLLD 

CTKHDWAPPRFCESPTSDLEMRGGRGRGKR 

ARSAAAAPGSEASFTESRGLQNKNRGGANGK 

GRRGSLNASGRRTPPNCAAEDIKASPSSTNKR 

KNKPPMELDLNSS SEDNKPGKRVRTNSRSTP 

TTPQGKPETTFLDQGCSSPVLIDCPHPNCNKK 

YKHINGLRYHQAHAHLDPENKLEFEPDSEDK 

ISDCEEGLSNVALECSEPSTSVSAYDQLKAPA 

SPGAGNPPGTPKGKRELMSNGPGSnGAKAGK 

NSGKXKGLNNELNNLPVISNMTAALDSCSAA 

DGSLAAEMPKLEAEGLIDKKNLGDKEKGKK 

ANNCKTDKMPSKLKSARP1APAPAPTPPQL1A 

IPTATFTTnTGTIPGLPSLTTTVVQATPKSPP L 

KPIQPKPT1MGEPITVNPALVSLKDICKKJCEKK 

KJLKDKEGKETGSPKMDAKLGKLEDSK.GASK 

DLPGHFLKDHLNKNEGLANGLSESQESRMAS 

IKAEADKVYTFTDNAPSPSIGS 


1139 


2489 


A 


9234 


207 


443 
328 


TRJRGQPWRRRAAAAGILTOREAAACLPSUAS 
VTAAVSGLLVGYELGIISGALLQIKTLLALSC 
HEQEMGVSSLVIGALL 
MAQGNNYGQTSNGVADESPNMLVYRKV 


1140 
1141 


2490 
2491 


A 
A 


9238 
9242 


248 
2 


535 


FVEAAVKMLGSLVLRRKALAPRLLLRLLRSP 

TLRGHGGASGRNVTTGSLGEPQWLRVATGG 

RPGTSPALFSGRGAATGGRQGGRFDTKCLAA 

ATWGRLPGPEETLPGQDSWNGVPSRAGLGM\ 

WPWAAALWHCYSKSPSNKDAALLEAARAQ 

\NMQEVSRNRCALLHSAAVQEYGYGN 


1142 


2492 


A 


9245 


157 


466 


HLCFWFFVGLFLPEQQIMLFATLLRMAQUUD 
FALGNDFLNITTKAQA/TKEKLDKLDFTKIKTC 
CTSMDAIEKTEPLTKWTKAFVSHVSYKRLLF 
GICKEYSRQ 


1143 


2493 



A 


9247 


264 


115 


GLPQQTSTIQPPGTPDGARDFTSTIQPPGAPDU 
ARDSTSIIRMGPEIPPP 


1144




A 




\ 


401 


KKWGRLSEMSFSLNKl'lJ'ANTTSSPVTVDCCiP 

SLGLAAGIPLLVATALLVALLFTLIHRRRSSIE 

AMEESDRPCEISEIDDNPKISENPRRSPTHEKN 

TMGAQEAHTYVKTVAGSEEPVHDRYRPTIEM 

ERRR 


1145 


2495 


A 


9264 


175 


411 


METIWIYQFRLIEIGDSTVGKSCLLHRFrQGR>* 
PGLRSPACDPTVGVDFFSRELEIEPGKRIKLLL 
WDTAGQERFISIT 


1146 


2496 


A 


9277 


592 


814 


MFTYLEGREGIKSQPKMEPHSVTVRLECSGMI 

SAHCSLNLPGTSDSPASASRA/AGTTGMRHHA 

WLIFAFLVETGF 


1147 


2497 


A 


9279 


1255 


2 


FRRGRRGEEEKEEEEEEEEGWVNGMENSHPP 

HHHHQQPPPQPGPSGERRNHHWRSYKLMIDP 

ALKKGHHKLYRYDGQHFSLAMSSNRPVEIVE

DPRWGIWTKNKE\LELSVPKFKIDEFYVDQV 

PPKQVTFAKLNDNIRENFLRDMCKKYGEVEE 

VEILYNPKTXKHLGIAKVVFATVRGAKDAVQ 

HLHSTSVMGNIIHVELDTKGETRMRFYELVLV 

TGRYTPQTLPVGELDAVSPIVNETLQLSDALK 

RLKDGGLSAGCGSGSSSVTFNSGGTPFSQDTA 

YSSCRLDTPNSYG/QGTPLTPRLGTPFSQDSSY 

SSRQPTPSYLFSQDPAVTFKARRHESKFTDAY 

NRRHEHHYVHNSPAVTAVAGATAAFRGSSD 

LPFGTVGGTGGSSGPPFKAQPQDSATFAHTPP 

PAQATPAPGFR 


1148 


2498 


A 


9302 


1026 


6 


1ASIQNADTMPGVGLLVSHFSTLVSRQRCPNY 

ADPQNLTDVSIFLLLEVSGDPELQPVLAGLFL 

SMCLVTVLGNLLIILAISPDSHLHTPMYFFFSN 

LSLPDV\GFTSTTVPK\MIVDI\QSRSRV1SYAG 

CLTQKSLFAIFGGTEEVNMLLSVMAYDRFVAI ! 

CHPLYHSAIMNPCFCAFLVLLSFFFLSLLDSQL 

HSWIVLQraiKNrVEJSNFVCDPSQLLKFACSD 

SIINSIFIYFHKDPERQLVLAGLFLSMCLVTVL

GNLIIILDVSPDSHLPTPMYFFLSNLSLPDIGFT 

STTVPKMIVDIQSHGRVIFYAGCLTQMSLFAIF 

GGMEERHAPECDGL 


1149 


2499 


A 


9303 


i 
1 


oyy 


MASQEKDIHGWGTIHLFRKPQRSFFGKLLRE 

FRLVAADRSMGRYMLFGVINLICTGFLLMWC

SSTNSIALT^YTYLTIFDLFSLMTCLISYWVTL 

RKPSPVYSFGFERLEVLAVFASTVLAQLGALF 

ILKESaERFLEQPEIHTGRLLVGTFVALCFNLF 

TMLSIRNKPFAYVSEAASTSWLQEHVADLSR 

SLCGHPGLSSIFLPRMNPFVLIDLAGAFALCIT 

YMLIEI 


1150 


2500 


A 


9308 


797 


693 


DRSTSVTRAGVQWCSLGSLQPRTPGLLRSSUL 
SLP 


1151 




A 

A 




205 


406 


VAIKELPVLWKWSKPTRMAKEPPQTQQRAU 
SKTAAPPCQWSRMASEGPNIPCPGARHSDKQ 
FLICTI 


1152 


2502 


A 


9314 


913 


504 


KPSPLITPPAVVLPPSAVLNLVNTFbSFPgvtv 
QGPLCGPRKGRLAVTEPFFGLS/LPKYMDHRR 
PPPHR\EIFFVFLAETGFHRASQAGPDLPTS/S/I 
PPTSA/FPKCWEYRSEPQCLPGCLSFSGILLDL 
GTNVSLRAA 


1153 


2503 


A 


9315 


392 


1 


HPHRPRPGFRSPARSSRPCPVLTSLLPPFPSPSP 
PADDLVKAGRDRKDPQVR/ERRLRPNPGRLG 
GPR\PRPARARS/CHQPRLTRVCPRSPPPEARA 
PAPAAPARGRGAPKRNRPRTDTRAPRGSSAR 
PGNS 


1154


2504 


A 


9321 


331 


433 


MPCI/QAQYGTPAPSPGPRDHSASDPLTPEFIK 
PT 


1155 


2505 


A 


9324 


180 


215 


MEEPQSDPSVEPPLSQETFSDLWKLLSENNVL 


1156 


2506 


A 


9326 


383 


619 


M1SPSRTEGDPLPLPP/EGEGQEVRGFGGGPAK 

EAAQRHCRASVSILRMRRPGQGSSRPARVPL 

RGPDSHRLREPPPSPP 


1157 


2507 


A 


9327 


152 


292 


YERRGRSQGGGSHPAGAQPGGRAIGAGWQS 
KEPLWEGLQRSGSPLPG 


1158




A 

A 


Q37R 


j 


430 


QELKQGPNPLAPSPSAPSTSAGLGDCNHRVD 

LSKTFSVSSALAMLQERRCLYWLTDSRCFL 

VCMCFLTFIQALMVSGYLSSVITTIERRYSLKS 

SESGLLVSCFDIGNLVVWFVSYFRGRRRRP/ 

RVAAVGGLLDLEGGEMI 


1159


2509


A 

A 




1 flR 

1 \JO 


JOJ 


KGNQVNGNGNQLKRKHESMCPVSLTQNTVR 
LMEAGLPQKQAERADELFEAGLVIYVKLDER 
VLNALVYSSVGLQWFKESDLSHLRLLEISFR 


1160 


2510 


A 


9338 


2 


430 


FVGRPRGLSDRLEDLFLAGFRVGERLRTAAM 
KRYVRILLLGEGAEHVADPVPGGRGVPRGEA 
DHTDQELREEIHKANVERVVHDVSQEATIEKI 
RTKWIPLV/RWGDHA/EGPVGIKSYLPSGRSM 
EAELPIMSQLTEIETCVEC 


1161 


2511 


A 


9341 


1 


390 


NSRVDDFVAPGLSEAGKLLGLEFPERQRLAA 
AVG/CSPMSGVISMSAPFFLGKIIDAIYTNPTV 
DYSDNLTRLCLGLSGVFLCGAAANAIRVYLM 
QTSRQRWKRLRTSLFSSILGQEVAFSDKAGT 
GELI 


1162 


2512 


A 


9343 


84 


837 


QGRFRAFCWQRDFLQPPGMRLSALLALASKV 

TLPPHYRYGMSPPGSVADKRKNPPWIRRRPV 

VVEPISDEDV/YLFCGDTVE1LEGKDAGKQGK 

VVQVIRQRNWVVVGGLNTHYRY1GKTMDYR 

GTMIPSEAPLLHRQVKLVDPMDRKPTEIEWR 

FTEAGERVRVSTRSGRIIPKPEFPRADGIVPET 

WTDGPKDTSVEDALERTYVPCLKTLQEEVME 

AMGIKETRNNTRRSIGIEPGAEQLLPNFCPSLE 

G 


1163 


2513 


A 


9346 


967 


616 


DSLALSPRLECSGAISAHCNLTPPGFTPFSCLS 
LPSSWAYRCASPHPDNFFVFLVESGFHHVGQ 
AGLKLLISSDPPTSA/FPKCWDYRRDNSSAPAT 
FSSYQRl^INPDLILNDTIMPNIK 


1164 


2514 


A 


9347 


3 


1099 


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTI 

HGGWRHHRDHTAIDEWDFNPSKFLIYTCLLX 

FSVLLPLRLDGIIQWSYWAVFAPIWLWKLLV 

VAGASVGAGVWARNPRYRTEGEACVEFKA 

MLIAVGIHLLLLMFEVLVCDRVERGTHFWLL 

VFMPLFFVSPVSVAACVWGFRHDRSLELEILC 

SVNILQFIFIALKLDRnHWPWLWFVPLWILM 

SFLCLWLYYTVWSLLFLRSLDWAEQRRTH 

VTMAISW1TIVVPLLTFEVLLVHRLDGHNTFS 

YVSIFVPLWLSLLTLMATTFRRKGGNHWWF 

AIRRDF/CQDQLPQPTGKPPPPPLTDHHGEKA 

LPLQNKDRGSWPASRGSPRLL 


1165 


2515 


A 


9362 


547 


991 


DVSIGPPLLRRPCSGREQTRSLSFPSDPESSFSP 

VPEGVRLADGPGHCKGRVEVKHQNQWYTV 

CQTGWSLRAAKVVCRQLRCGRAVLTVQKRC 

TKHAYGRKPIWLSQMACSGPEPTLHDCPFRP 

LGEDTLFHVEYTSVHGRERLSAKD 


1166 


2516 


A 


9363 


201 


387 


PPILRWTPPSGKNFFFFFFFESEFY/SSPRVECS 
GAISAHLAHCNLCLPGSSDSPASAFQVAS 


1167


2517


A 


9368 


707 


1087 


A VLTPCLSPC SPSRIPRP\SRPYPGRRSLSh 1 PP
PRPLILYAPAPVRPAGTAFIPHSHPPPPDLLRPT
atpa/TPPP<nT PPPPRPI HPTOPSTALLPDPPPW 
PLPFPPPSS/RPPRPDCSTSYSPTFPPPT 


1168 


2518 


A 


9375 


511 


15 


MMLSEETSAVRPQKQTRFNGAKLVWMLKGS 

PITWSAV1IVLMLLMM/IFSPWLATHDPNAID 

t tapt T PP^A AUWFGTDFVGRDT FSRVLVGS 

QQSILAGLVWATTGMIGSPLECLFGELGGRA 

DAIFMRVMDIMRS/1PSLVLTMEKTAALGPSL 

FNAMQASSEH 


1169 


2519 


A 


9377 


42 


410 


GNGRVAPRDPGAVASAEPGLTTHDSGVNPN 
NSARRMEAMASGSNWLSGVNVVLVMAYWS 
LVFVLLFIFAKRQIMRFAMKSLRGPHGPVGH 
NAPKDLKEE1DILLSRVHNIKYEP\HLLADDDA 


1170 


2520 


A 


9378 


302 


1303 


GVSGFSASVLRQRRMEDELEPSLRPRTQIQGR 

ILLLTICAAGIGGTFQFGYNLSIINAPTLHIQEF 

TNETWQARTGEPLPDHLVLLMWSLIVSLYPL 

GGLFGALl^GPLAJTLGRKIvbLLNVNNir V V £> 

AAILFGFSRKAGSFEMIMLGRLASWGVNAGV 

SMN1QP\MLPGGESAPKELRGAVAMSSAIFTA 

LGIVMGQWGLSTTAATGLRGLXAGELEELEE 

ERAACQGCRARRPWELFQHRALRRQVTSLV 

VLGSAMELCGNUbV YAYAooVrKKAuvrDA 

KIQYAIIGTGSCELLTAVVSVSLEGALPPPAL 

WGGTPRSFALNQFTLQKKKK 


1171 


2521 


A 


9381 


2 


412 


RGPASAQEDERARTAPLERVRARGRMTTSSA 
LFPSLLPCSWSTSNKYLAJirKAol^ ic. 
TPDKJ^KGLAYAQQTDDSLIHFCWKDRTSGNV 
EDDLIIFPDDCEFKRLl^CPNGRVYVLKPKAG 
SKRLFFWMQEP 


1172 


2522 


A 


9384 


20 


355 


CTHGGKVTIPDPHDMLTTVVHKIK^ 

LQLCAIMISDYLKSSm^KJRLGLFRPTSGLL 

ASFNEVGNTALIVLESY 


1173 


2523 


A 


9393 


430 


87 


QRKEDQ* I1L* YHLNKX)CLH1FMSAJTLYMKI* 
KJFVLFDFNIMFETPFYII*FIFLFSQNLKRJRQV 

IRPPISFSKINNGP 


1174 


2524 


A 


9397 


77 


374 


RLVTCLALLQLLRAFYGIKVKGVRVHRDCGTF 
ESSSTLIR VS * FG VPCNALAHFG VTHF* YILDF 
LGML 


1175 


2525 


A 


9399 


66 


397 


upcci? a np n^MTiTR GSTYTDADPVNKSGGT 
AKiv^'KWSKGKVRDKLNNLVLFDTATYDKL 
CKEVPNYKLITLAVVSERLKIPGSLARAALHE 
LLSRGLI*LVIQH1AQVIY 


1176 


2526 


A 


9408 


2 


299 


t nr thvi ST SISI TVTTl GTTFGKfVIPLLDWY 
GERGY AQNGDF* DAQLDD YSFSCYSHAQVN 
GAPNSLTRAYDDP*VKISGLECQKVGALVEV 
KH NT 


1177 


2527 


A 


9416 


2 


402 


CNFLRSSR1RVHSTPAASTMPPKVDPNEIKVV 
YLRCTGGEVRATSALAPKJGPLGLSSIKVGVD 
FV*ATGDWNVLUSVILTI1ULLSHIFVVPPFFCF 
DHLIAFWDLQSLIFLHVIFSLFITLLLFCFFSEF 


1178 


2528 


A 


9419 


142 


426 


TPLFDLWPRVVLSWLETVLTSLRTRRAASGPP 

ACRIMPTTVDDVLEHGGEVHFLQKQMLYLL 

ALI*DTFAPIYVGrVFLGFTPDHRCRSPGVAEL 


1179 


2529 


A 


9420 


1450 


1655 


LSS AGTKMNLN * KN YWPG ASAHACNPSTLG 

GQSRCITRSGDRDHPG*HGETPSVLKJQKJSRA 

WWRAP


1180


2530 


A 


9422 


176 


375 


HRFQTTRPD WKPRT*PQGK* GRLSSEISPASPP 
qrf^R<5TKPVPPK ADPPAROKI TGVLHAPLLK 

L j 


1181 


2531 


A 


9436 


2 


274 


PIAASLRMYNLQPYTEENLICTAFATMVETW 
IARTCLDRLTGIPHG YCFVE* AD WATADKCVH 
IYNGKPLPGATPLLSLQLHQLAHLGS 


1182 


2532 


A 


9442 


3 


240 


— \rrtvi^cci^CT\rr CirvrPUrMPQl ^T'nPKPFfrnT 

SMILK*MGAGDEKISAMGKARVDHRELYLGL 
LYPTEDYKLTFRARH 


1183 


2533 


A 


9444 


384 


3 


1 t vnmDU/A? iTTiU/PT EWTTFT T PI VT Ff^FTR 
LKD r Qr W ALrliJ Vv r Lr CLL i r LLr l v Lc^r its. 

KGCSGWAPWLSLQCQHFGRPRWADHLRSGV 

RDQPGQYSKTTFLPKIQKLAGHSGAHL* S*LL 

ERMRWKNRLNPGGRSCSEPRWHHCTPGWAT 

ERG 


1184 


2534 


A 


9462 


391 


655 


LSGFKSLMPKTPLQYIYVRVRTTWSFCLPLDG 
RKLMLo ¥ Y LI K i fNiLrt. I oKM l L,rr\Jiv± v 
IHTCNPSTLGGRAGWIV*AQEFET 


1185 


2535 


A 


9467 


215 


566 


RCPMWQGQASRMDPAKAKJDREASTCCSLA 
WWWGWECWVRALKLSSGPAGPLACWVAK 
KKSLSLSGPVYPSEKGAGLYVF*DRVSLCHPG 
WSAWQr WL I AAbN btr bLLo oW U i tsAsfK 


1186 


2536 


A 


9468 


275 


452 


HIPQLHTKTHYVPTRMVNKI^QIDNSKPWQR 
GG*TGILTHCW*ESKLVQPLWKJVWHYQ 


1187 


2537 


A 


9469 


388 


3 


EVAPGPSQILPRR VTDGGDRPQ1- bLFOrKLi'v^ 
SSRGAEPCLSNCIHSPAPRKQRMGDSDQ*STP 
NPASPHPEAPQEPWDSASGSVGSFSLGRGAK 

a cc>*\ rD^yrDflDD (~\nQTn 1 A PTTT FT FT AT ATsJ 

S 


1188 


2538 


A 


9471 


124 


397 


TMDKKNRHGNSLDMASEIHMTGPMCLlENrr 
GRLMANPEALKILSAJTQPMVEEAIAGLYRAC 


1189 


2539 


A 


9480 


584 


769 


GHVQSQHFGRPRRADHLRSGDRDHPG* HDET 
PSLLKIQKJSWAWWRAPVVPATWEAEAEEW 

R 


1190 


2540 


A 


9483 


463 


86 


VTVGLTLLLRGAPRFTAG*PPSGGGPPLAPLL 

GAS S CRRJRRCNP VL AARJKAG SPRSH STRENC 
RRSRCPDTAHRRRRRGRRRNPSCVRSPRWR 


1191 


2541 


A 


9489 


1 


411 


" t a r\ a t n o a a A Tn A VT? PTi A P AOPQTRRPT 
SVRVCCRAAAASNLLYSSCLQRHSERASEEG 
PPHQI c AJCPP^T VI RGOCSSSNSHSFRRIT*EI 
MAAFVLLSYEQRPLKRPRLGPPDVYPPDPKQ 
KEEELTAVNVK 


1192 


2542 


A 


9497 


389 


161 


VSFLSMSSGHCIRSTRGSKN1VSWSV1AKIQEI* 
rFFnFRKMABFFI AEFMSTYVMMNIHMTVE 
KDTYSDHEEINTS 


1193 


2543 


A 


9509 


186 


1 


J^SQ*KJIWQRSGAMETX1^GWWECKLVQF 
FGKTFVNVN* S*TYVYPCDKULLLGLYPTEM 


1194 


2544 


A 


9512 


58 


433 


PLQRSKCLTLRCLRAKPWAWSQSPRACSSAL 
LKSSRSRASSLNVQCILQSNPQGHQRI*KQKA 
SSKGQQFRR*K£HPFMLKTLNKLRIEGT*LKI 
RRAIYDNPTANIIVEGQKLEAFPLRTGTRQ 


1195 


2545 


A 


9515 


595 


1223 


GHGAPSFQTQVPRTP*ASWPWPAASESAPAP 

AGGGASLPVAAGSCAAAPHTEPGAPQHLLDC 

PCPLCLARPPRRPLPDTCYGPGSGRSASLAEPP 

LPRCSCAPLRSASAPQVS*CV*AVNLLPHNL* 

PLHLLLHD*EKAWGFLFSSASHCFQGQICLLP 

APGSGPCGATARPSRGGRAGGSRARRPIPPGP 

GTRRTPSGCQNPAASGG


1196 


2546 


A 


9518 


229 


468 


RSPTATPAPHAMGPGAPFARGGRPLPLLGAM 
AERVAPG WDLHTPYLPRTNSRRTPHL* *EPHA 
GYIGALFPMSGGWPGGQ 


1197


2547 


A 


9521 


289 


448 


1AWLSGLFFPSNQANLCFLCYKLTADSRYRG 
HAMRHLTGNTSMAIRFL* ADSRFQ VQRARYE 
APNWKYKYGY* IP VDMLC 


1198 


2548 


A 


9524 


204 


I 


KNKKTTKCLSI VTLNISGPNQ* NKRHRVAE Wl 
VKQEFN1CHL* ETHFPFRDTYRLKEREQKKRK 
SSYS 


1199 


2549 


A 


9546 


1785 


1943 


GGRFKESKLTNAGWQRNSFFIGPPKSIPWAA 
V*QRGDGKNPGVTHLNRPVGTX 


1200 


2550 


A 


9548 


186 


1 


VNAEKEF* KIQHYFMTKSQNKLHIEHTYLKPI 
KAI YDK WTSDIMLNLQKL* AFFLRVTVRQI 


1201 


2551 


A 


9549 


591 


2 


SSV VEFPRGPRSSLPPLDS 1 VPCGSSPNWTGGC 
GSCPSGE*LVSPGSEQRKKYSNSNVIMHETSQ 
YHVQHLATFIMDKSEAITSVDDAIRKLVQLSS 
KEKIWTQEMLLQVNDQSLRLLDLESQEELEDF 
PLPTVQRSQTVLNQLRYPSVLLLVCQDSEQSK 
PDVHFFHCDEVEAELVHEYMESALTDCRLGIC 
AMRP 


1202 


2552 


A 


9-OZ 


AIR 


\ 


KYGNEGHWSRQCPNPGKP1RPCPLCRGPHWK 
LDCERPPQGPLPSLPELAKTSYSDLTGLATED 
♦ WGPGNfDAPATTIASSKTRVTLMV AGRPVFF 
LI # YRATYSALPNFSGPTQSSQVSWGIDGQV 
SKPRATPPLFCSLHTF 


1203 


2553 


A 


9568 


517 


738 


RRKFERKQKQ*RYREGKQYRQRDKMKEWG 
EKEKRRREKGEREERKMRHRERKGESGQRD 
TMENWRVERLTEKER 


1204 


2554 


A 

A 






415 


EDKRLRLVDGDSRCAGRV*IYHDGFWGTICD 
DGWDLSDAHVVCQKLGCGVAFNATVSAHFG 
EGSGPIWLDDLNCTGTESHLWQCPSRGWGQ 
HDCRHKEDAGVICSEFTALR " 


1205 


2555 


A 


9577 


64 


424 


ARG S CPTRPRT ANGRMGETKD APQML VTFK 
DVA VTFFREEWRQLVL VHRTL YR* GMLETC 
GLLDTLRHNVPQPDVVHLLYHGTQLL1VKRE 
VSHSPCAGDMRELFTREATLTPHPYNNGA 


1206 


2556 


A 


9584 


38 


476 


TLGAVLFSEVSKESSTSHSGGQLGRQNRHPKL 
SNFITPSSPRLKP*TASSQRNLGQILNMFLTAV 
NTOPLSTPSWQIETKYSTKVLTGNWMEERRK 
GLP YKHLITHHQEPPHRYLI STYDDHYNRHG 
YNPGLPPLRTWNGQKLLWL 


1207 


2557 


A 


9586 


2 


412 


LRSSPAALLRALC1ITVTOTALALRSRVATTN 
PIXjCRNVLRPKYYRLCDKAESWGIALETVPT 
GVAVTSWA1MLTVLTLVCKGQDYNRRQKLP 
THlLCLL*EKGIFGLTFAFIIGUXjSTGPTRFFL 
FGILFSICFS 


1208 


2558 


A 


9597 


122 


3 


1KNYWPGMVAHACNPSPLGGRGRWIA*AQK 
FADAWADAW 


1209 


2559 


A 


9611 


148 


558 


KSLRNVWDLLNNTWKADRFFCHSSRTSTIRK 
GDPGPTFSKMSrWTSGRTSSSYRHDEKRNIYQ 
RTRDHDLLDKRKTVTALKAGEDRAILLGLAM 
NWCSIMM*FLLGITLLRSYMQSVWTRESQCT 
LLNASITETFNC 


1210 


2560 


A 


9618 


384 


2 


SLHDMLMLAEQQQKQKWAV^NTQNTAWSNA 

DSKFGQRILEKMEWSKGRGLGVQEQGGPDDI 

KVQVKNNDLGLQATINNEANW1AHQDDFNW 

LLAELNTCQRQETADS***WSPKNSHVGKDS 

GELSAK 


1211 


2561 


A 


9620 


316 


610 


QKHPGGGQLGRSPQEDSRFHNKASSGVSRVR
LGRA W WLTP VEPTL WEAKAG G SPE * D* AGRG 
GSRL* SQHFGRPRRVDHLRSAVQDQPGQHGE 
TPSLLKIQKIN*VWGRRL*SSYSEAEAGESL 


1212 


2562 


A 




907 


344 


QFPVEXjDYQKIEKITQLFQAQNLSLCLAMTR 
TREL*KGGGKGRHE*AWPFLKKGGYGVKAP 
AILNTSNCT*CF*ETKMLSDDPKACVFEVSSA 
DL*NTSFGVIR 


1213 


2563 


A 


9624 


2 


356 


"AELSLASTACGRNTSGDSLPDYDRAPISSPLA 
TSGTILSAISCLWDLPTPVLRVGLSCQPSMSSQ 
IPRMYSTDVEAAVNSLEDLYLQAYYAYLCVG 
L YFHRDDMALEG VSRFL* ELAE 


1214 


2564 


A 


9634 


776 


912 


SLSRWVRAKL*VPYNQENCLNPRGGGCSEPR 
SHYCTPAWATEKDS 


1215 


2565 


A 


9636 


220 


426 


" KPGNFA VSSEY*DITSGQLKTAVRG*IEMTST 
EENFGEKLHDIGFGNGFLDKT* KAQ ATKAK1 
DK 


1216 


2566 


A 


9637 


391 


76 


CFLEDGCTQAS*AEEAAVSPSMAEEEQGSTSC 
RERRSIRFKMKNHSPDDnKEKVTISNIRTRKI 
NHLPETERNLLEHGLMYIRLNAAFCSLVAHS 
LFGFELKAT 


1217 


2567 


A 


9655 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKV 

EEHHLQPVQVLQTLLHSATAGTGCRRPARPP 

PAPPTPTPWRSRQSGKQSERAS*LKGRGRYGL 

GALGGRGGRAJLGGSRWPPPLPGETLFSGCKH 

RRRRRGSDAAPGEEAGT 


1218 


2568 


A 


9658 


3 


405 


HASARALLSPNLSPNNKMAISGGPVLGFFI1A 
VLMSAQEPWAIKEEHVIIQAEFYLNPDQSGEF 
MLDFEGEDTFHGDMAKKETVWRLE*LARLD 
NPE AQRAL ANIAADQ AALE1MDMG SDYTL1P 
NVPPKVTVL 


1219 


2569 


A 


9662 


3 


284 


PD WTEKRKMQDTG S ILPLH WFGFG Y AAL V A 

YGGIIGYVKAGSVPSLAAGLLFGSLSGLGAYQ 

LSQDPRNVWVFLATSGTLAGIMGMRFYHSG 

KL 


1220 


2570 


A 


9669 


200 


699 


"LLLTG YIQTLQNQQLSGNQQEMQAVDNL'1 5A 
PGNTSLCTRDYKITQVLFPLLYTVLFFVGLITN 
GLAMRIFFQIRSKSNFIIFLKNTVISDLLMILTF 
PFKIl^DAKLGTGPLRTFVCQVTSVIFYFTMYI 
SISFLGLITIDRYQKTTRPFKTSNPKNLLGAKIL 

K 


1221 


2571 


A 


9676 


164 


562 


K£RDSSTFSAAMTTMQGMEQAMPGAGPGVF 
QLGNMAVIHSHLWKGLQEKFLKGEPKVLGV 
VQILTALMSLSMGnMMCMASNTYGSNPISV 
YIGYTIWGSVMFUSGSLSIAAGIRTTKGLVRG 
SLGMNITSS 


1222 


2572 


A 


9688 


43 


412 


" " VAKMVKCCSAIGCASRCLPNSKLKGLTFHVF 
PTDENIKJ^WVLAMKRIDVNAAGIWEPKKG 
DVLCSRHFKKTDFDRSAPNIKLKPGVIPSIFDS 
PYHLQGKREKLHCRKuVFTLKTVPATNYNH 


1223 


2573 


A 


9696 


308 


564 


" RTSMGILYSEPICQAAYQNDFGQVWRWVKii 
DSSYANVQDGFNGDTPLICACRRGHVRIVSFL 
LKJCECLCQPQKPERENLLALCCE 


1224 


2574 


A


9700 


3 


632 


DAWASGGELGSLFDHHVQRAVCDTRAKYKh 

GRRPRAVKVYTINLESQYLLIQGVPAVGVMK 

ELVERFALYGAIEQYNALDEYPAEDFTEVYLI 

KFMNLQSARTAKRKMDEQSFFGGLLHVCYA 

PEFETVEETRKXLQMRKAYVVKTTENKDHY 

VTKKKLVTEHKDTEDFRQDFHSEMSGFCKA 

ALNTSAGNSNPYLPYSCELPLCYFSSK


1225 


2575 


A 


9710 


1 


163 


RSGCVLRMTEWETGAPAVAETPDIKLFGKWS 
TDDVHINDISLQDYIAGVRLELL 


1226 


2576 


A 


9713 


82 


492 


QGLPSFLPAFGPSGSWLGPAPTLGSSCNTVDT 

1CHGYSEIRPLFYLSFCDLLLGLCWLTETLLYG 

ASVANKDUCYNLQAVGQIFYISSFLYTVNY1 

WYLYTELRMKHTQSGQSTSPLVTDYTCRVCQ 

MAFVFSSLI 


1227 


2577 


A 


9720 


3 


416 


GKWKRTQVPLLGEECADMDLARKEFLRGNG 
LAAGKMNIS1DLDTNYAELVLNVGRVTLGEN 
NRKKMKDCQLRKQQNENVSRAVCALLNSGG 
GV1KAEVENKGYSYKKDGIGLDLENSFSNML 
PFVPNFLDFMQNGNYF 


1228 


2578 


A 


9723 


278 


411 


EASSSNTVASNVADKTDPHSMNSRVFIGNLN 
TLVLQKSDVEAVF 


1229 


2579 


A 


9725 


121 


902 


LFAMSGFENLNTDFYQTSYSIDDQSQQS YD Y 

GGSGGPYSKQYAGYDYSQQGRFVPPDMMQP 

QQPYTGQIYQPTQAYTPASPQPFYGNNFEDEP 

PLLEELGINFDHIWQKTLTVLHPLKVADGSIM 

NETDLAGPMVFCLAFGATLLLAGKIQFGYVY 

GISA1GCLGMFCLLNLMSMTGVSFGCVASVL 

GYCLLPMILLSSFAVIFSLQGMVGIILTAGIIG 

WCSFSASKIFISALAMEGQQLLVAYPCAIXYG 

VFALISVF


1230 


2580 


A 


9739 


11 


247 


TF VLNMNTPKEEFQD WPIVR1AAHLPDLI V Y G 
HFSPERPFMDYFDGVLMFVDISGKCKRDVCL 
MWMSNRLAWEFTCRA 


1231 


2581 


A 


9744 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPP 

ACRIMPTTVDDVLEHGGEFHFFQKQMFFLLA 

LLSATFAPIYVGIVFLGFTPDHRCRSPGVAELS 

LRCGWSPAEELNYTVPGPGPAGEASPRQCRR 

YEVDWNQSTFDCVDPLASLDTNRSRLPLGPC 

RDGWVYETPGSSTVTEFNLVCANSWMLDLFQ 

S S VN VGFFIGSMSIG YIADRFGRKLCLLTTVLI 

N AAAG VLMAI SPT YTWMLIFRLIQGL VSKAG 

WLIGYILITEFVGRRYRRTVGIFYQVAYTVGL 

LVLAGVAYALPHWRWLQFTVALPNFFFLLY 

YWCIPESPRWLISQNKNAEAMRIIKHIAKKNG 

KSLPASL 


1232 


2582 


A 


9753 


164 


517 


PGPGMQGPPPITPTSWSLPPWRAYVAAAVLU 
YINLLNYMNWFI1AGVLLDIQEVFQ1SDNHAG 
LLQTVFVSCLLLSAPVFGYLGDRHSRKATMS 
FGILLWSGAGLSSSFISPRYSWLF 


1233 


2583 


A 


9757 


25 


419 


LPAPWTERVRKSEGLVGTCLGDPMASPRI V l 
IVALSVALGLFFVFMGTIKLTPRLSKDAYSEM 
KRAYKSYVRALPLLKKMGrNSILLRKSIGALE 
VACGIVMTLVPGRPKDVANFFLLLLVLAVLF 
FHQLV 


1234 


2584 


A 


9765 


71 


456 


RLELDWGFSLHFLPVAYLCPLSSGFEMNVQP 
CSRCGYGVYPAEKISCIDQIWHKACFHCEVC 
KMMLSVTWFVSHQKKPYCHAHNPKJ^NTFFS 
VYHTPLNLNVRTFPEAISGIHDQEDGEQCKSV 

FHWD 


1235 
1236 


2585 
2586


A 
A 


9767 
9770 


52 
352 


559 
608 


TO SHAMS VDKAELCGSLLTWLOTFHVPSPCA 

SPQDLSSGLAVAYVLNQIDPSWFNEAWLQGI 

SEDPGPNWKLKVTSGLLIRGQTGEEMTRDGP 

ARHMSVA^MGRKRDRCLVTNHLFIHSSMEYSP 

CARPGHSARKOTDKNLPHTAIILVTSNTYTTI 

KJNFQAGRSGSCL _^ 
FRGEAi^TVRFLTKRnGEYASNFESIYl'U^KLC
LERKQLNLEIYDPCSQTQKAKFSLTSELHWA 
DGFVIVYDISDRSSFAFAKAL1 


1237 


2587 


A


7 17 J 


266 


515 


NILAHYFPFPRLFLLRDSQSNPKAFALTLCHH 
QKJKNFQILPVSIDALTPPLVVCFLVSFLTHFS 
RYKPTRP VCITQFQGC S 


1238 


2588 


A 


9802 


537 


967 


ELGAGRSDREAMEAAVKEEISVEUEAVDKNI 

FRDCNK1AFYRRQKQWLSKKSTYRALLDSVT 

TDEDSTRFQIINEASKVPLLAEIYGIEGNIFRLK 

INEETPLKPRFEVPDVLTSKPSTVRLISCSGDT 

GSLILADGKGDLKC 


1239 


2589 


A 


9805 


105 


540 


VPGDPAMVRAGAVGAHLPASGLDIFXiULKK 

MNKRQLYYQVLNFAMIVSSALMIWKGLIVLT 

GSESPIVVVLSGSMEPAFHRGDLLFLTNFRED 

PIRAGEIWFKVEGRDIPIVHRVIKVHEKDNG 

DDCFLTKGDNNEGDDRGSYK 


1240 


2590 


A 


9819 


3 




TDGRDPLPCAARRRGGGGECCGAGWVAfcWS 
PQPLDPAMLLV/MQGFVLEAVACQDNDDYLR 
YGILFEDLDCNGDGWDI1ELQEGLRNWSSAF 

DPNSEEHG 


1241 


2591 


A 


9834 


841 


1209 


SPARGKSNRTDVMITAPKNKKMTENLAAPEA 
LDSSTHSSSTATQSRAJCMNTPAPTPSTVPAIPR 
GGSGGPPPCAPHDRVSSVLQCDTQAMDHKTE 
S SKS WEFLFKRTKTPS PFHP A VRENRN 


1242 


2592 


A 


9843 


3 


589 


"TISCGPATEPPASLLSSASSDDFCKEKTEDRYS 
LGSSLDSGMRTPLCRICFQGPEQGELLSPCRC 
DGS VKCTHQPCLDCWISERGC W SCELCY YKY 
HV1AISTKNPLQWQAISLTVIEKVQVAAA1LGS 
LFLIASISWLIWSTFSPSARWQRQDLLFQICYG 
MYGFMDVMIVAVDSEDMVQAAKEVGKRWS 

D1PP 


1243 


2593 


A 


9846 


198 


411 


WRISHHAGKMPVMKGLLAPQNTFLU'l'lA'l Kl- 
DGTHSNFILANAQVAKGFPIVYCSDGFCELAG 

FARTEVMQ 


1244 


2594 


A 


9848 


116 


650 


PICGFLYLCSAMASESSPLLAYRLLGEEGVAL 

PANGAGGPGGASARKLSTFLGVWPTVLSMF 

SIWFLRIGFWGHAGLLQALAMLLVAYFILA 

LTVLSVCA1ATNGAVQGGGAYCILQHRWTG 

VWPVLPAREVMISRTLGPEVGGSIGLMFYLA 

NVCGCAVSLLGLVESVLDVFGA 


1245 


2595 


A 


9849 


573 


1620 


"KSKCRFPEGLSEGFGPMRKEALSSGSVQEAE 
AMLDEPQEQ AEGSLTVYVI SEHSSLLPQDMM 
SYIGPKRTAVWGIMHREAFN11GRRIVQVAQ 
AMSLTEDVLAAALADHLPEDKWSAEKRRPL 
KSSLGYEITFSLLNPDPKSHDVYWDIEGAVRR 
YVQPFLNALGAAGNFSVDSQILYYAMLGVNP 
RFDS ASSS YYLDMHSLPHVINP VESRLGS SAA 
SLYPVLNFLLYVPELAHSPLYIQDKDGAPVAT 
NAFHSPRWGGIMVYNVDSKTYNASVLPVRV 
EVDMVRVMEVFLAQLRLLFGIAQPQLPPKCL 
LSGPTSEGLMTWELDRLLWARSVENLATATT 

TLTSLA 


1246 


2596 


A 


9850 


114 


464 


PPQLGAQRVREPRHPDVRAPLRVTSPGLRSKb 
ARSLGRRPRi AM Vj v ON i LtAtur v urAw jvi 
QDGLSPCFFFTLVPSTRMALGTLALVLALPCK 
RRERPAGADSLSWGAGPRJSSYV 


1247 


2597 


A 


9851 


2 


327 


F VRNKKMTRS C S A VGCSTRDTVLSRERGLSF 
HQFPTDTIQRSKW1RAVNRVDPRSKKJWIPGP 
GAILCSKHFQESDFESYGIRRiCLKKGAVPSVS 
LYKVFKYSSRCTS


SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ ID NO:
in USSN
09/496914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)
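
The legend for the amino acid sequence column defines a compact encoding: one-letter residue codes plus three annotation marks (`*`, `/`, `\`). As an illustration only, the following Python sketch shows one way a reader might tally residues and locate those marks in a sequence string copied from the table. The names `AMINO_ACIDS`, `ANNOTATIONS` and `summarize` are assumptions introduced here and are not part of the specification. Symbols outside the legend (for example digits left by a poor scan) are reported rather than silently dropped.

```python
# One-letter codes and annotation marks as listed in the table legend.
# The names AMINO_ACIDS, ANNOTATIONS and summarize are illustrative only.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}
ANNOTATIONS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(sequence):
    """Count residues and record 0-based positions of legend marks."""
    cleaned = "".join(sequence.split())  # remove line breaks and spaces
    residues = 0
    marks = {label: [] for label in ANNOTATIONS.values()}
    unrecognized = []  # symbols outside the legend, e.g. digits from a bad scan
    for position, symbol in enumerate(cleaned):
        if symbol in AMINO_ACIDS:
            residues += 1
        elif symbol in ANNOTATIONS:
            marks[ANNOTATIONS[symbol]].append(position)
        else:
            unrecognized.append((position, symbol))
    return {"residues": residues, "marks": marks, "unrecognized": unrecognized}
```

Calling `summarize` on a multi-line cell from the amino acid sequence column returns the residue count, the positions of any stop codons or possible frameshift marks, and a list of characters that do not belong to the legend.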


1248 


2598 


A 


9853 


58 


444 


RVDDFVYSKGGKDAGGADVSLACRRQSIPEE 

FRG1TVVELIKKEGSTLGLTISGGTDKDGKPR 

VSNLRPGGLAARSDLLNIGDYIRSVNGIHLTR 

LRHDEIITLLKNVGERWLEVEYELPPPGGCP 

WT 


1249 


2599 


A 


9856 


2 


1265 


LPPPRPSRHRRGRAGTRASAAAAAGPTVSAV 

RAPVRGQDSGAGTPQGRLAGRGAHLSRVGA 

SGSGVAAGPAARHAPRRRCADAGEAVGASC 

GRCAVALLSGVCTLVSTHVCV GSGCPGAAGT 

PMGAGDAGASAESAVTTAPQEPPARPLQAGS 

GAGPAPGRAMRSTTLLALLALVLLYLVSGAL 

SDQELGLLIKEVADALGGGADPETNSTSNSSH 

SAWDLGSAFFFSGTIITTIGGGGDWHVGGGK 

ELPHGGRCRETEGSQVAPRLPASPLCPGYGN 

VALRTDAGRLFCIFYALVGIPLFGILLAGVGD 

RLGSSLRHGIGHIEAIFLKWHVPPELVRVLSA 

MLFLUGCLLFVLTPTFVFCYMEDWSKLEAIY 

FVIVTLTTVGFGDYVA 


1250 


2600 


A 


9873 


2 


652 


c\ r\ menr*/^ n.n>/iT> A T>Xtn a CPPT'K/ffTWQ A^I?"WT>F 
r V Vr orUOvjlr UKAr INU Aolu 1 JYHjrN o/\otsJNi^r 

EWVYTDQPHTQRRKEILAKYPAIKALMRPDP 

RLKWAVLVLVLVQMLACWLVRGLAWRWLL 

FWAYAFGGCVNHSLTLAIHDISHNAAFGTGR 

AARNRWLAVFANLPEGVPYAASFKKYHVDH 

HRYLGGDGLDVDVPTRLEGWFFCTPARKLL 

WLVLQPFFYSLRPLCVHPKAVTRMEVLNTLV 

QLA 


1251 


2601 


A 


9875 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSR 
LESYRPDTDLSREDTGCNLQHISDRENIDDLN 
MEFNPSDHPRASTIFLSKSQTDVREKRKSLFIN 
uuDorj^i a d vvcqpqttft nn^TV^OPMT KYTT 

Ijjijj Ov^lArviS. I DoLj 1 LrLjLJUa 1 V oyiiiLN III 

KCVALAIYYHIKNRDPDGRMLLDIFDENLHPL 
SKSEVPPDYDKHNPEQKQIYRFVRTLFSAAQL 
TAECATVTLVYLERLLTYAEIDICPANWKRIV 
t fiATT I A^xvwnnOAVWNVDYCOILKDlTVE 
DMNELERQFLELLQFNINVPSSVYAKYYFDL 
RSLAEANNLSFPLEPLSRERAHKLEAJ SRLCED 
KYKDI RRSARKRSASADNLTLPRWSPAIIS 


1252 


2602 


A 


9879 


6 


376 


KRPDSRPPAQYRAGPTRPRTRGCELLYWKAT 
KAVGIKMGSLSTANVEFCLDVFKELNSNNIG 
DNIFFSSLSLLYALSMVLLGARGETEEQLEKV 
WNSSEVCSEPRSLSCSRSGSAKL1LSLYQ 


1253 


2603 


A 


9880 




JOO 


KEQAELLYGLYCQCDLTLSSFIPSSVPAMSSC 

NFTHATFVLIGIPGLEKAHFWVGFPLLSMYVA 

AMFGNC 


1254 


2604 


A 


9881 


19 


494 


VISFQIITDTIMDSSTAHSPVFLVFPPEITASEYE 

STELSATTFSTQSPLQFXFARKMKILGTIQILF 

GIMTFSFGVIFLFTLLKPYPRFPFIFLSGYPFWG 

SVLFINSGAJFLIAVKRKTTETLIILSRIMNFI^A 

LGAIAGDLLTFEFHPRSKLHL 


1255 


2605 


A 


9896 


// 




RPGREQRDCFQ APPLGLGGRQTDMMHHPLT 
GATCVGLPNVGMCPQLSGALTFMYLQQGNQ 
EATVAPDTMAQPYASAQFAJPPQNGIPGEYTA 
PHPHPAPEYTGQTT 


1256 


2606 


A 


9902 


95 


399 


SGGPAGLLHRPVLPKMGLSGLLPILVPFILLG 
DIQEPGHAEGILGKPCPKDCVECEVEEIDQCTK 
PRDCPENMKCCPFSRGKKCLDFRKVSLTLYH 
KEELE I 


1257 


2607 


A 


9905 


374 


459 


EHLKSTPNRLGWAHTCNPSTLGGRGGW


1258 


2608 


A 


9911 


364 


1974 


AGPGVPAVGGRWASGPGLGGRTLCSGPFUH 

QRRGPSCGASGDPQCVGSPHPQRARPLLARP 

GARLLPGHLPSPRPPRLPTGQPPAAAFRGPVR 

POGGGHIHPLPTPGGRPCFAVSEGSGSALLLS 

YLGECGSSSYVTGAACISPVLRCREWFEAGLP 

WPYERGFLLHQK1ALSRYATALEDTVDTSRL 

FRSRSLREFEEALFCHTKSFPISWDAYWDRND 

PLRDVDEAAVPVLCICSADDPVCGPPDHTLTT 

ELFHSNPYFFLLLSRHGGHCGFLRQEPLPAWS 

HEVILESFRALTEFFRTEER1KGLSRHRASFLG 

CM R RPiG ALORREVSSSSNLEEIFNWKRSYTRL 

MAAAAGAAAAPGSREPQDRPECGAGHPGPR 

YYRHPERWLLRPE AFLGPLRTRAPS AED SQR 

FRPAARSGPEMRVRYPWAAVLAPYLALSQD 

PMVKSSASGQGASGSYNHVREEMLIKAGGA 

MSRRVVRQSKFTlHVFGQAAKADQAYEDrRV 

SKVTWDSSFCAVNPKFLAIIVEAGGGGAFIVL 

PI AK 


1259 


2609 


A 


9919 


693 


935 


GCFKFIGESTCCWIFPSSVTTQCWAKAPKAA 
TLSKAERLRSQPGPEQGGSSYRPRTPTAAAIL 
PPRPGRSHRKRKL V STK 


1260 


2610 


A 


9921 


455 


1082 


^QRSCLCSAIEKDGGDVKALYRRSQALEKXUK 
LDOAVLDLQRCVSLEPKNKVFQEALRNIGGQ 
10EKVRYMSSTDAKVEQMFQILLDPEEKGTE 
KXOKASQNLWLAREDAGAEKIFRSNGVQLL 
ORT I DMGETDLMLAALRTLVGICSEHQSRTV 
ATLSILGTRRWSILGVESQAVSLAACHLLQV 
MFDALKEGVKKGFRGKEGAIIV 


1261 


2611 


A 


9928 




438 


GFRGAEAPGAAQAPKKKKPRP1EGGPOAOSO 
RGKDP YRGPTLLHQPKPPKDEFLS SLESYE1 AF 
PTRVDHNGALLAFSPPPPQRQRRGTGATAES 
RLFYKEASPSTHFLLNLTRSSRLLAGHVSVEY 
WTREGLAWORADRPHCLYA 


1262 


2612 


A 


9931 


168 


435 


AAEMGRAGAAAV1PGLALLWAVGLGGPPPA 
PPRLPFCLQELQGRHALHTFSLERTCSYQDFL 
WADEGRLLHVGAQDLATWHTLSPLGLW 


1263 


2613 


A 


9938 


247 


488 


RMS ATSVDQRPKGQGNK V SVQNGSIHQKlXi 
CNDDDFEPYLRSPDNQSNSYPPMSDPYMPGY 
YAPSIGFPYSLGEAAWSQL 


1264 


2614 


A 


9941 


61 


277 


" ESIGLTAIGPRRRPWEHRNVSDPITLRMKOWU 
WLALLLGALLGTAWARRSQDLHCGACKAVR 
RRVRQFNIYDY 


1265 


2615 


A 


9956 


2 


J LI 


" FVASEVSKMPVPASWPHPPGPFLLL1XLLOL1 
EVAGEEELQM1QPEKLLLVTVGKTATLHCTV 
TSLLPVGPVLWFRGVGPGRELIYNQKEGHFP 
RVTTVSDLTKRNNMDFS1RISSITPADVGTYY 
CVKFRKGSPDHVEFKSGAGTELSVRGEYSVG 
FLSOVWWWLSSHPFMN 


1266 


2616 


A 


10002 


243 


387 


PKNNACHLLFTAVCQPRCKHGEC1GPNK.l;kU 
HPGYAGKTCNQGRKTV 


1267 


2617 


A 


10004 


36 


707 


LPAPASTWSVARETMASSSVPPATVSAA I AU 
PGPGFGFASKTKKKHFVQQKVKVFRAADPLV 
u7p\m ucTxrPT QnVPPPVK/TT I PDDFKAS 
SKIK VNWLLFHRENLP SHFKFKE Y CPQ VFRNL 
RDRFGIDDQDYLVSLTRNPPSESEGSDGRFLIS 
YDRTLVIKEVSSEDIADMHSNLSNYHQVRPLS 
SP1LSLSSLLTYSSAIVSNRCQLGRKLIGRENP 


1268 


2618 


A 


10005 


2 


209


GEGYELFVPSNGVPAVCHMVGRRPHRAVLbr 
SQDELEHSLGESAAQGAAGWL WVS WENTR
TKVSLGLA 




1269 


2619 


A 


10010 


245 


688 


FGNlLl^GHSSKKDNLAVNAVALQDrllLHlJ 
LQLRNl^VADHSKTQVQKKENKSLKRDTKAI 

idtglkkttqcpkledsekeyvldpkpppltl 
aqklgligppppplssdewekvkqrsllqgds 
vopcpickeefelrpqvfsirg 




1270 


2620 


A 


10011 


2 


588 


rvddfvrplppglmsrsrasihrgsipamsya 

PFRDVRGPSTHRTQYVHSPYDRPGWNPRFC1I 

sgnqllmldedeihpllirdrrsessrnkllr 

RTVSVPVEGRPHGEHEYHLGRSRRKSVPGGK 

qysmegapaapfrpsqgflsrrlkssikrtks 
qpkldrtssfrqilprfrsadhdryrgwsmw 

prinv 


1271 


2621 


A 


10013 


209 


363 


LPAPPNLSPRLSFGFQFPGGNDNYLTII CjKSrir 
FLSGAEVSQSCRRRGGRA 


1272 


2622 


A 


10014 


7 


388 


SAVTISWKWRSVMGlQ'l^PALLASLUAULV l 
LLGLAVGSYLVRRSRRPQVTLLDFNEKDLLR 
LIDKTLSARSPCKLHIYLSTRIDGSLSIRPYTPVT 
SDEDOGYVDIDIKVYLKGVHPTFPEGGKMSH 


1273 


2623 


A 


10016 


1 


1339 


M AARTLGRG VGRLLGSLRGLSGQPAKPPUU v 

SAPRRAASGPSGSAPAVAAAAAQPGSYPALS 

AQAAREPAAFWGPLARDTLVWDTPYHTVW 

DCDFSTGKIGWFLGGQLNVSVNCLDQHVRKS 

PESVALIV^RDEPGTEVRITYRELLETTCRLA 

NTLKRHGVHRGDRVAIYMPVSPLAVAAMLA 

CARIGAVHTV1FAGFSAESLAGRINDAKCKW 

ITFNOGLRGGRWELKKIVDEAVKHCPTVQH 

VLVAHRTDNKVHMGDLDVPLEQEMAKEDP 

VCAPESMG SEDMLFMLYTSGSTGMPKGI VHT 

QAGYLLYAALTHKLVFDHQPGDIFGCVADIG 

WITGHSYVVYGPLCNGATSVLFESTPVYFNA 

GRYWETVERLKINQFYGAPTAVRLLLKYGD 

AWVKKYDRSSLRTLGSVGEPINCEAWEWLH 

R WGDSRCTLVDTWW QT 


1274 


2624 


A 


10017 


1 


3750 


" FRPCXjTPRSPASHVLTMSAPDhUKKUrrKrrxu 
KTLGSFFGSLPGFSSARNLVANAHSSARARPA 
ADPTG AP AAE AAQPQ AQ V AAHPEQT AP WTE 
KELQPSEKMVSG AKDLVCSKMSRAKD AVS S 
GVASVVDVAKGWQGGLDTTRSALTGTKEV 
VSSGVTGAMDMAKGAVQGGLDTSKAVLTG 
TKDTV STGLTG A VNV AKGTVQAGVDTTKTV 
LTGTKDTVTTGVMGAVNLAKGTVQTGVETS 
KAVLTGTKDAVSTGLTGAVNVARGSIQTGV 
DTSKTVLTGTKDTVCSG VTGAMNV AJCGTIQT 
GVDTSKTVLTGTKDTVCSGVTGAMNVAKGT 
IQTGVDTSKTVLTGTKDTVCSGVTGAMNVA 
KGTIQTGVDTTKTVLTGTKNTVCSGVTGAVN 
LAKEA1QGGLDTTKSMVMGTKDTMSTGLTG 
AANVAKGAMQTGLNTTQNIATGTKDTVCSG 
VTGAMNLARGTIQTGVDTTKIVLTGTKDTVC 
SGVTGAANVAKGAVQGGLDTTKSVLTGTKD 
AVSTGLTGAVNVAKGTVQTGVDTTKTVLTG 

TKDTVCSGVTdAYN v AJ\.uA v vuui-a;i 

V1GTKDTMSTGLTGAANVAKGAVQTGVDTA 

KTVLTGT7CDTVTTGLVGAVNVAKGTVQTGM 

DTTKTVLTGTKDTIYSGVTSAVNVAKGAVQT 

GLKTTQN1ATGTKNTFGSGVTSAVNVAKGAA 

QTGVDTAKTVLTGTKDTVTTGLMGAVNVAK 

GTVQTSVDTTKTVLTGTKDTVCSGVTGAAN

VAKG AIQGGLDTTKS V LTGTKD AVS'l UL l UA 

VKLAXGTVQTGMDTTKT\TTGTKDAVCSGV 

TGAANVAKGAVQMGVDTAKTVLTGTKDTV 

CSGVTGAANVAKGAVQTGLKTTQN1ATGTK 

NTLGSGVTGAAKVAKGAVQGGLDTTKSVLT 

GTKDA VSTGLTGA VNL AKGTVQTG VDTSKT 

VLTGTKDTVCSGVTGAVNVAKGTVQTGVDT 

AKTVLSGAKDAVTTGVTGAVNV AKGTVQTG 

VD ASKA VLMGTKDTVF SG VTG AMSMAKG A 

VOGGLDTTKTVLTGTKDAVSAGLMGSGNVA 

TGATHTGLSTFQNWLPSTPATSWGGLTSSRT 

TDNGGEQTALSPQEAPFSGISTPPDVLSVGPEP 

AWEAAATTK.GLATDVATFTQGAAPGREDTG 

LLATTHGPEEAPRLAMLQNELEGLGDIFHPM 

NAEEQAQLAASQPGPKVLSAEQGSYFVRLGD 

LGPSFRQRAFEHAVSHLQHGQFQARDTLAQL 

QPCFFT 


1275 


2625 


A 


10025 


124 


415 


TILARKKEKTCPCKKEIGRNSRSGM Y SKAAM 
YKRKYSAANTKVEKKKKEKVLAPVTKPVGG 
DKNGGTRVVKLPTMPRYYPTEDVPRKLLSHG 

KKPFS 


1276 


2626 


A 


10030 


3 


507 


GGSLRFSPPRVPSCSRVFCPVPPUOUULrsnva^ 
ASRPQSPTTPWCLPRRYMKHKRDDGPEKQED 
EAVDVTPVMTCVFVVMCCSMLVLLYYFYDL 
LVYWIGIFCLASATGLYSCLAPCVRJRLPFGK 
CR1PNNSLPYFHKRPQARMLLLALFCVAVSV 


1277 


2627 


A 


10035 


51 


869 


"YSRFTVPLPATMASSb V AKHLLFQSHMA L K l 
TCMSSQGSDPEQIKRENIRSLTMSGHVGFESL 
PDQLVNRSIQQGFCFN1LCVGETGIGKSTLIDT 
LFNTNFEDYESSHFCPNVKLKAQTYELQESN 
VQLKLTIVNTVGFGDQrNKEERQLGRSQSTEN 
PQKYRSEQHPVEPKKCTSFWKGALGKWAGIE 
SSGQSAQQPYLPINSPPHRLADVADVHLFSSV 
LSGAFGCYHLDVTVNEFKKQQNRDEQEGYS 
rn pn F nr^AVKHnADPLRGGEM 


1278 


2628 


A 


1UU30 




457 


" RAFDVRRKKSLRPCCPRDFHAGCLTVbCihSf 
VMGAVGESLSVQCRYEEKYKTFNKYWCRQP 
CLPIWHEMVETGGSEGWRSDQVIITPHPGDL 
TFTVTLENLTADDAGKYRCGIAmQEDGLSG 
Fl PnPFFOVOVLVSSASSTENSVKTP 


1279 


2629 


A 


10039 


214 


435 


" NDSLVPMSSWRSCARAPSSESAWRK5>AA1KK 
SRKCLRTKRKRWSSGKGTQMQSTLSETPRRA 
OMPCMWWYPFWG 


1280 


2630 


A 


10043 


2 


344 


" ' RATWHNAGKEREAVQLMAGAEKRVICASHb 
FLRGLFGGNTRIEEACEMYTRAANMFKMAK 
NWSAAGNAFCQAAKLHMQLQSKHDSATSFV 
DAGNAYKKADPQGKTARHVACYLCV 


1281 


2631


A 


10080 


620 


818 


VIYKLDSSLFSYFIYFFIFETESHFLPLMKW to 
PTMAHCSLKILASRNSADSAFLSAGDTSLSHST 


1282 


2632 


A 


10084 


3 


1640 


SASlllRGDKJIASGEVGlAPbSRHlLlGEPSAKY 

NGTAIISLVRGPG1LGEVTVFWRIFPPSVGEFA 

ETSGKLTMRDEQSAVIVVIQALNDDIPEEKSF 

YEFQLTAVSEGGVLSESSSTANITWASDSPY 

GRFAFSHEQLRVSEAQRVNmiRSSGDFGHVR 

LWYKTMSGTAEAGLDFVPAAGELLFEAGEM 

RKSLHVEILDDDYPEGPEEFSLTITKVELQGR 

GYDFTIQENGLQIDQPPElGNlSrVRIIIMKNDN 

AEGIIEFDPKYTA7EVEEDVGLIMIPWRLHGT
YGYVTADFISQSSSASPCKjVDYILHGSTVTKJ 
HGQNLSFIN1SIIDDNESEFEEPIEILLTGATGG 
AVLGRHLVSRII1AKSDSPFGVIRFLNQSK1SIA 
NPNSTMILSLVLERTGGLLGEIQVNWETVGFN 
SQEALLPQNRD1ADPVSGLFYFGEGEGGVRTII 
LTIYPHEEIEVEETFIIKLHLVKGEAKLDSRAK 
D VTLTI Q EF GDPN GW Q F APETL SKKTY SEPL 
PHPT I TTFFVRRVKGTFGE[M 


1283 


2633 


A 


10088 


316 


516


MGSKTLP APVPIHPSLQLTNYSFLQ AVN ULV 1 
VPSDHLFNLYGFSALHAVHLHQWTLGYPAM 

HLXRS 


1284 


2634 


A 


10091 


2 


569 


FVSPSRAMASALlYVSKFKSFVILWTPLLLLi' 

LVILMPAKFVRCAYVIILMAIYWCTEVIPLAV 

TSLMPVLLFPLFQILDSRQVCVQYMKDTNML 

FLGGLIVAVAVERWNLHKRIALRTLLWVGA 

KP ARLMLGFMG VTALL SM WI SNT ATT AMMV 

PIVEAILQQMEATSAATEAGLELVDKGKAKE 

T P 


1285 


2635 


A 


10092 


290 


728 


" KQSTRPDVMTL YPLHWQEEMSGES Y V55A v r 
AAATRTTSFKGTSPSSKYVKLNVGGALYYTT 
MQTLTKQDTMLKAMFSGRMEVLTDSEGWIL 
IDRCGKHFGTILNYLRDGAVPLPESRRE1EELL 
AEAKYYLVOGLVEECQAALOV 


1286 


2636 


A 


10100 


1 


574 


RPRGRGAWAGPGGDYSGVRRQQRRRTKJSUb 

QRGSDAAGTMGCCTGRCSLICLCALQLVSAL 

ERQIFDFLGFQWAPILGNFLHIIW1LGLFGTIQ 

YRPRYIMVYTVWTALWVTWNVFIICFYLEVG 

GLSKDTDLMTFNISVHRSWWREHGPGCVRR 

VLPPSAHGMMDDYTYVSVTGCIVDFQYLEVI 

HSA 


1287 


2637 


A 






376 


RSRMGDKPIWEQIGSSFIQHYYQLFDNDK1 
GAJYVSFQL 


1288 


2638 


A 


10107 


1 


478 


MEEEDESRGKTEESGEDRGDGPPDRDK1 Lars 

AFILRAIQQAVGSSLQGDLPNDKDGSRCHGL 

RWRRCRSPRSEPRSQESGGTDTATVLDMATD 

SFLAGLVSVLDPPDTWVPSRLDLRPGESEDM 

LELVAEVRIGDRDPIPLPVPSLLPRLRAWRTG 

KT 


1289 


2639 


A 


10113 


237 


438 


" LLSRMPSTNRAGSLKDPEfAELFFKEDFfcRi.n 
DLREIGHGSFGAAYFARDVRTr^VVAIKKMS 

YSG 


1290 


2640 


A 


10114 


367 


856 


" RGAKAKSAVLPPGPPCSSlLlLSPPAPLirKbfU 
TEATRPTAMSKSLKKKSHWTSKVHESVIGRN 
PEGQLGFELKGGAENGQFPYLGEVKPGKVAY 
ESG SKL VSEELLLEVNETP V AGLTIRDVL AVI 
KHCKDPLRLKCVKQGES SGLLS VLPGGGTAR 
OA™"? 


1291 


2641 


A 


10116 


128 


591 


" RTIRETERRSALSCS VLKSEPLPGLQFgAS 
RRRLPGRRQVQVQEGGGSGLRAWVLAMASV 
LGSGRGSGGLSSQLKCKSKRRRRRRSKRKDK 
VSILSTFI^FKHLSPGITNTEDDDTLSTSSAE 
VKENRNVGNLAARPPPSGDRARGGATR 


1292 


2642 


A 


10121 


1 


749 


ORRRFRAGLWGGHGLTDGLRKNCiUUGCSAR 

VPRVGERLRGHRCPDPLCLLLDMLFLSFHAG 

SWESWCCCCLIPADRPWDRGQHWQLEMADT 

RSVHETRFEAAVKV1QSLPKNGSFQPTNEMM 

LKFYSFYKQATEGPCKLSRPGFWDPIGRYKW 

DAWSSEGDMTKEEAMIAYVEEMKKIIETMP 

MTEKVEELLRVIGPFYEIVEDKKSGRSSDITSD 

LGNVLTSTPNAKTVNGKAESSDSGAESEEEh 

AC 


1293 


2643 


A 


10124 


2 


989 


PLMSLVRWEFVAASSAQK'ITSRLENYYMVC 

KADEKJ^QLVHFLRNHKQEKHLVFFRYSSGL 

CGRG1RDSARMCSTCACVEYYGKALEVLVK 

GVKIMCIHGKMKYKRNKEFMEFRKLQSGILV 

CTDVMARGIDIPEVNVAO.QYDPPSNASAFVH 

RCGRTARIGHGGSALVFLLPMEESYINFLAIN 

QKCPLQEMKPQRNTADLLPKLKSMALADRA 

VFEKGMKAFVSYVQAYAKHECNLIFRLKDL 

DFASLARGFALLRMPKMPELRGKQFPDFVPV 

DVNTDTIPFKDKIREKQRQKLLEQQRREKTEN 

F r,p R KTFTTCMX A WSKOKAKKK 


1294 


2644 


A 


10129 


91 


1042 


" VTM YKDCIESTGDYFLLCDAEUP WUIILIi^la 
1LGIV\TTI1XLLAFLF1^RKIQDCSQ\WVLPTQ 
i i pry evi HT FGLAF AF11ELN QQTAPVRYFLF 
GVLFALCFSCLLAHASNLVKLVRGCVSFSWT 
TTT riAIGCSLLOIlIATEYVTLIMTRGMMFVN 
MTPCQLNVDFVVLLVYYLFLMALTFFVSKAT 
FCGPCENWKQHGRLIF1TVLFSDIWVVWISML 
LRGNPQFQRQPQWDDPWCIALVTNAWVFL 
LLYIVPELCILYRSCRQECPLQGNACPVTAYQ 
HSFOVENOELSRDKWKVLLNSDFLSHSGA 


1295 


2645 


A 


10133 


376 


518 


r PR VVTHN SOWCFLPODHPGWLPGQSG APG 
GRGAPROEGPGSSWRQV 


1296 


2646 


A 


10135 


3 


551 


EWSLDPFMGIMSGQVGDLSPSQEKSLA^hKH 

NIQDVLSALPNPDDYFLLRWLQARSFDLQKS 

EDMLRKHMEFRKQQDLANILAWQPPEVVRL 

YNANGICGHDGEGSPVWYHIVGSQDPKGLLL 

SASKQELLRDSFRSCELLLRECELQSQKLGKR 

VEKHAIFGLEGLGLRDLWKPGIELLQE 


1297 


2647 


A 


10138 


48 


407 


MVSSCCGSVCSDOGCGQDLCQETCCRPSCLfc 
TTCCRTTCCRPSCCVSSCCRPQCCQSVCCQPT 
CSRPSCCQTTCCRTTCYRPSCCVSSCCRPQCC 
OPVCCQPTCCRPSCCETTCCHPXCC 


1298 


2648 


A 


10156 


94 


453 


■ ■ GGNRKSAEMFSQVPRTFASGCYYLNSM'ireu 
OEKm.RFDQTTRRSPYRMSRILARHQLVTKI 
QQEIEAKEACDWLRAAGFPQYAQLYEDSQFP 
INWAVKNDHDFLEKDLGEPLCRRLNT 


1299 


2649 


A 


10161 


1 


393 


" PRFSELVDGRGRVSARFGGSPSKAATVRSQl'l 
ASAQLENMEEAPKRVSLALQLPEHGSKD1GN 
VPGNCSENPCQNGGTCVPGADAHSCDCGPGF 
KGRRCEI^CrKVSRPCIKLFSETKAFPVWEGG 
vruuv 


1300 


2650 


A 


10162 


98 


391 


" AK1ASLERIMPANYTCTRPDGDNTDFRYK1 Y A 
VTYTGILGPGL1GNILALWVFYGYMKETKRA 
VlFMrNLAIADLLQVLSLPLRIFYYLKHDWPF 

VPV 


1301 


2651 


A 


10165 


1 


" 7545 


" pGIRVGITSQTGLSSNLQENCSKLAFlbSHUit 
KQLQCMPMEGRGRAS S SISDLQGKGFEKGTG 
EKHVPGVGSARHSPQASAGGSPWQRGKAQT 
RWI.GKPDPGRKRRRGSPQEEGGLRVSAAAR 
LLCSGANRCKVL VKQNb J rN l avhtj a r 
PSRPLPQAGRCLVAPLRPHPDWVAAKTLAKA 
LRAPGKPWRLAAPSPLGDLGAPGLPGPSTAP 
RTLSVEEPGVECNQLCLYADVTDPVLCLGQK 
DPGVEGKHCEK£KISSSK£LKHVHAKSEPSKP 
ARRLSESLHVVDENKNESKIEREHKRRTSTPV 
IMEGVQEETDTRDVKROVERSE1CTEEPQKQ 

KSTLKNEKiiKJCDDSETPHLKSLLKKEVKSS 

KEKPEREKTPSEDKLSVKHKYKGDCMHKTG 

DETELHSSEKGLKVEEN1QKQSQQTKLSSDDK 

TERKSKHRNERKLSVLGKDGKPVSEYIIKTDE 

NVRK£NNKKERRLSAEKTKAEHKSRRSSDSK 

IQKDSLGSKQHGITLQRRSESYSEDKCDMDST 

NMDSNLKPEEVVHKEKRRTKSLLEEKLVLKS 

KSKTQGKQVKVVETELQEGATKQATTPKPD 

KEKNTEENDSEKQRKSKVEDKPFEETGVEPV 

LETASSSAHSTQKDSSHRAKLPLAKEKYKSD 

KX)STSTRLERKJLSDGHKSRSUOiSSKDIKKKD 

ENKSDDKDGKEVDSSHEKARGNSSLMEKKL 

SRRLCENRRGSLSQEMAKGEEKLAANTLSTP 

SGSSLQRPKKSGDMTLIPEQEPMEIDSEPGVE 

NVFEVSKTQDNRNNNSHQDIDSENMKQKTS 

ATVQKDELRTCTADSKATAPAYKPGRGTGV 

NSNSEKHADHRSTLTKKMHIQSAVSKMNPGE 

KEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPS 

LSSVTWPLRESYDPDVIPLFDKRTVLEGSTA 

STSPADHSALPNQSLTVRESEVLKTSDSKEGG 

EGFTVDTPAKASITSKRHIPEAHQATLLDGKQ 

GKVIMPLGSKLTGVIVENENITKEGGLVBMA 

KXENDLNAEPNLKQTIKATVENGKKDGIAVD 

HVVGLNTEKYAETVKLKHKRSPGKVKDISID 

VERRKENSEVDTSAGSGSAPSVLHQKNGQTE 

DVATGPRRAEKTSVATSTEGKDKDVTLSPVK 

AGPATTTSSETRQSEVALPCTSIEADEGLIIGT 

HSRNNPLHVGAEASECTVEAAAEEGGAWTE 

GFAESETFLTSTKEGESGECAVAESEDRAADL 

LAVHAVKIEANVNSVVTEEKDDAVTSAGSEE 

KCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEG 

TVTCTGAEGRSDNFVICSVTGAGPREERMVT 

GAGWLGDNDAPPGTSASQEGDGSVNDGTE 

GESAVTSTGITEDGEGPASCTGSEDSSEGFAIS 

SESEENGESAMDSTVAKEGTNVPLVAAGPCD 

DEGIVTSTGAKEEDEEGEDWTSTGRGNEIGH 

ASTCTGLGEESEGVLICESAEGDSQIGTWEH 

VEAEAGAAIVCNANENNVDSMSGTEKGSKDT 

DICSSAKGIVESSVTSAVSGKDEVTPVPGGCE 

GPMTSAASDQSDSQLEKVEDTTISTGLVGGS 

YDVLVSGEVPECEVAHTSPSEKEDEDIITSVE 

NEECDGLMATTASGDITNQNSLAGGKNQGK 

VLIJSTSTTNDYTPQVSAirDVEGGLSDALRTE 

ENMEGTRVTTEEFEAPMPSAVSGDDSQLTAS 

RS EEKDEC AMISTSIGEEFELPI S S ATTDCC AES 

LQPVAAAVEERATGPVLISTADFEGPMPSAPP 

EAESPLASTSKEEKDECALISTSIAEECEASVS 

GVA'VESENERAGTVMEEKDGSGDSTSSVEDC 

EGPVS S AVPQEEGDPS VTPAEEMGDT AM1STS 

TSEGCEAVMIGAVLQDEDRLTITRVEDLSDA 

AUSTSTAECMPISASIDRHEENQLTADNPEGN 

GDLSATEVSKHKVPMPSLIAENNCRCFUi'VK 

GGKEPGPVLAVSTEEGHNGPSVHKPSAGQGH 

PSAVCAEKEEKHGKECPEIGPFAGRGQKESTL 

HLINAEEKNVLLNSLQKEDKSPETGTAGGSST 

ASYSAGRGLEGNANSPAHLRGPEQTSGQTAK 

DSSVSSIRYLAAVNTGAIKADDMPPVQGTVA 

EHSFLPAEQQGSEDNLKTSTTKCITGQESKIAP 

SHTMIPPATYS V ALLAPKCEQDLTIKN U Y SGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVS 

SEENVCDlG>ffiESPLNVLGGLKLKANLKMEA 

YVPSEEEKNGEILAPPESLCGGKPSG1AELQRE 

PLLVNESLNVENSGFRTNEEIHSESYNKGEISS 

GRKDNAEAISGHSVEADPKEVEEEERHMPKR 

KRKQHYLSSEDEPDDNPDVLDSRIETAQRQC 

PPTFPHATKFENSRDLEELPKTSSETNSTTSRV 

MEEKDEYSSSETTGEKPEONDDDT1KSQE 


1302 


2652 


A 


10167 


321 


842 


EPSLFPFLRPSPARPPPRPPAPFPSPELAGPEPH 

x:\nrvxrci QWHPPKFI AKYEYMEEOVILTEKG 

NSTVAGRGTSVRCLSPSPRPLPPLLPLLADLLE 

DGFGEHPFYHCLVAEVPKEHWTPEGNPSPFP 

EARETKCYVRSSVGCVEPLTTQAEVTENLDR 

KNSQQVFKLLKKK 


1303 


2653 


A 


10171 


206 


429 


NMILLKKRRLLINSLGEGTINGLLDELLb 1 N V 
i cr^trnTTTTVVPPK'VTVinKARDLLDSVIRKGA 

RACEICITYI 


1304 


2654 


A 


10184 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNCi^ 
YHEAVVLFTQALKLNPQDHRLFGNRSFCHER 
LGQPAWALADAQVALTLRPGWPRGLFRLGK 
at \Am riorcp a a AVFOFT1 RGGSOPDAAREL 
RSCLLHLTLQGQRGG1CAPPLSPGALQPLPHA 
ELAPSGLPSLRCPRSTALRSPGLSPLLH 


1305 


2655 


A 


10194 


2 


394 


TDLLGRRFRVDGAAMAACEGRRSGALGSSQ 
SDFLTPPVGGAPWAVATTVVMYPPPPPPPHR 
TMTTO\rrT QPriP^YnN^K^WRRRSCWRKWKOL 
SRLQRNMILFLLAFLLFCGLLFYTNLADHWKG 
rpNTPT 


1306 


2656 


A 


10195 


1 


410 


TPr»qrr<;T FOPr SKWTNVMKGWQYRWFVLD Y 

NAGLLSYYTSKDKMMRGSRRGCVKLRGAVI 

GIDDEDDSTFTITVDQKTFHFQARDADEREK 

WIHALEET1XRHTLQLQVRVFTWFPDSSLVGA 

FFFWLVSGFFFK 


1307 


2657 


A 


10205 


85 


308 


r,r T T P<?TUVK1 fiPSFSGICPGKDPGDOIXjAAM 
DSVPLISPLDISQLQPPLPDQW1KTQTEYQLS 

SPDQQNYTKSR 


1308 


2658 


A 


10214 


2 


453 


ECGGIRQPGPGPPPALAS APAATMNRV 0031^ 
AAANYLLCTNCRKVLRKDKRIRVSQPLTRGP 
SAFIPFKEWOANTVDERTNFLVEEYSTSGRL 
DNITQVMSLHTQYLESFLRSQFYMLRMDGPL 
PLPYRHYIAIMAAARHQCSYLINM 


1309 


2659 


A 


10233 


45 


421 


R OWPEOOSTGRPRDVARQPRCQKEEGRRLKP 
RALESRTFQGSERSRWGPPLESTKENVQCGH 
RPAFPNSSWLPFHERLQVQNGECPWQVSIQM 
SRKHLCGGSILHWWWVLTAAHCFRKTLLDM 

AV 


1310 


2660 


A 


10241 


243 


442 


" AFQLFNAKCESAFLSKRNPLQRNWTVL Y kkk. 
HKKGQSAEIQKXRTOIAFKFQRAITGASLADI 

MAK 


1311 


2661 


A 


10261 


751 


176 


LPGADYGGGHLSLRLFHLLLTSAAWVFDfc^ 

VTLNSAlCVLSTVLIMEFTDLGrCHCSEK-rCKQ 

LDFLPVKCDACKQDFCKDHFPYAAHKCPFAF 

QKDVHVPVCPLCNTTIPVKKGQIPDVVVGDHI 

DRDCDSHPGKKKEKIrTYRCSKEGCKKKEML 

QMVCAQCHGNFCIQHRHPLDHSCRHGSRPTI 

KAG 


1312 


2662 


A 


10270 


3 


669 


STSSDEG SPS A STPMINK'I GFKFS AEKP VLE V ¥ 
SMTILDKKDGEQAKALFEKVRKFRAHVEDSD 

LIYKLYWQTVTKTAKPIFILCYTANFVNAISF 

EHVCK^KVEHLIGYEVFECTHNMAYMLKKL 

LISYISIICVYGFICLYTLFWLFRIPLKEYSFEKV 

REESSFSDIPDVK^FAFLLHMVDQYDQLYS 

KRFGVFLSEVSENKLRE1SLNHEWTFEKL 


1313 


2663 


A 


10287 


1221 


266 


GAHRVLSPAQGAQPRLRSAASVEVSMVGQR 

VLLLVAFLLSGVLLSEAAKILTISTLGGSHYLL 

LDRVSQILQEHGHNVTMLHQSGKFLIPDIKEE 

EKSYQVmWFSPEDHQKRIKiCHFDSYIETALD 

GRKESEALVKLMEIFGTQCSYLLSRKDIMDSL 

KNENYDLVFVEAFDFCSFL1AEKLVKPFVAIL 

PTTFGSLDFGLPSPLSYVPVFPSLLTDHMDFW 

GRVKKFLMFFSFSRSQWDMQSTFDNTIKEFIF 

PEGSRPVLSHLLLKAELWFVNSDCAFDFARPL 

LPNTVYIGGLMEKPDCPVPQVSEPSAFSLGFT 


1314 


2664 


A 


10288 


536 


1890 


rNVQLAKFSSTLVTFFSCDADPSALAKYVLAL 
VKKDKSEKELKALCIDQLDVFLQKETQIFVEK 
LFDAVNTKSYLPPPEQPSSGSLKVEFFPPQEK 
DIKKEEITKEEEREKKFSRRLNHSPPQSSSRYR 
ENRSRDERKKDDRSRKRDYDRNPPRRDSYRD 
RYNRRRGRSRSYSRSRSRSWSKERLRERDRD 
RSRTRSRSRTRSRERDLVKPKYDLDRTDPLEN 
NYTPVSSVPSISSGHYPVPTLSSTITVIAPTHHG 
NNTTESWSEFHEDQVDHNSYVRPPMPKKRC 
RDYDEKGFCMRGDMCPFDHGSDPWVEDVN 
LPGMQPFPAQPPVVEGPPPPGLPPPPPILTPPPV 
NLRPPVPPPGPLPPSLPPVTGPPPPLPPLQPSG 
MDAPPNSATSSVPTVVTTGIHHQPPPAPPSLFT 
ADTYDTDGYNPEAPSITNTSRPMYRHRVHPR 
AKLG 


1315 


2665 


A 


10293 


447 


1331 


SHPLLSCPEKVSAKLRAAAEAAAEERRTRGA 

GSRGICAGLRSVAPGPEPLKQEEGRREWGSSI 

GTP SPCG S AQ AAAAAAAEEATEKIP ALRP ALL 

WALLALWLCCATPAHALQCRDGYEPCVNEG 

MCVTYHNGTGYCKCPEGFLGEYCQHRDPCE 

KNRCQNGGTCVAQAMLGKATCRCASGFTGE 

DCQYSTSHPCFVSRPCLNGGTCHMLSRDTYE 

CTCQVGFTGRNPKCPGGNLNYQFNGIIWYS 

GG S VPPS GTKTSKP AEHN AM GTG SKNF AS GT 

LWVMVSGATSTSTSTL 


1316 


2666 


A 


10294 


118 


572 


" SLSMESNHKSGDGLSGTQKEAALRALVQRTG 
YSLVQENGQRKYGGPPPGWDAAPPERGCEEFI 
GKLPRDLFEDELIPLCEKIGKIYEMRMMMDF 
NGNNRGYAFVTFSNKVEAICNAIKQLNNYEIR 
NGRLLGVCASVDNCRLFVGGIPKTKK 


1317 


2667 


A 


10301 


158 


1956 


LLKSCGVLLSGVCIPCEGKGPTVLVIQTAVPy 

DRPTKSSMRSAAKPWNPAIRAGGHGPDRVRP 

LPAASSGMKSSKSSTSLAFESRLSRLKRASSE 

DTLNKPGSTAASGVVRLKKTATAGAISELTES 

RLRSGTGAKm'KRTGIPAPREFSVTVSRERSV 

PRGPSNPRKSVSSPTSSKTPTPTKHLRTPSTXP 

KQENEGGEKAALESQVRELLAEAKAKDSEIN 

RLRSELKX YKEKRTLN Abu 1 DALur in v laj i o 

VSPGDTEPMIRALEEKNKNFQKELSDLEEENR 

VLKEKLIYLEHSPNSEGAASHTGDSSCPTSITQ 

ESSFGSPTGNQLSSDIDEYICKNIHGNALRTSG 

SSSSDV7XASLSPDASDFEHITAETPSRPLSSTS 

NPFKSSKCSTAGSSPNSVSELSLASLTEKIQKM 

EENHHSTAEELQA1XQELSDQQQMVQELTAE 

NEKLVDEKTILETSFHQHRERAEQLSQENtKX 
MNLLQERVKNEEPTTQEGKIIELEQKCTGILE 
QGRFEREKLLNIQQQLTCSLRKVEEENQGAL 
FMIKRI K F RNEKLNEFLELERHNNNMMA KTL 
EECRVTLEGLKMENG SLKSHLQG 


1318 


2668 


A 


10303 


333 


879 


GECFLMAAWQQNDLVFEFASNVMEDERQL 

GDPAIFPAVIVEHVPGADILNSYAGLACVEEP 

NDM1TESSLDVAEEEIIDDDDDDITLTVEASCH 

DGDETIETIEAAEALLNMDSPGPMLDEKRINN 

\nF^PFnn\/fVV APVTHVSVTLDGIPEVMETQ 

OVOEKYADSPGASSPEQPKRKKK 


1319 


2669 


A 


10322 


169 


654 


MEVRMSGSVAVTRAIAVPGLLLLLIIATALSL 
LIGAKSLPASWLEAFSGTCQSADCnVLDAR 
LPRTLAGLLAGGALGLAGALMQTLTRNPLAD 
nni t r^A/xi aaa QF A TV! f,A ALFGYSSAOEOLA 
MAFAGALVASLIV AFTGSQGGGQLSPVRLTL 

AGVXL 


1320 


2670 


A 


10323 


441 


2 


KMNQ VAW1GGGQTLG AFLCHGL AAEG Y R V 
A WDI QSDKAANVA QEIN AEYGESMA YGFG 
AnATcnncvi at QRnVDFTFGRVDLLVYSAGI 
AKAAFISDFQLGDFDRSLQVNLVGYFLCARE 
FSRLMIRDGIQGRIIQINSKSDE 


1321 


2671 


A 


10332 


1 


453 


RHRT AGPG STI S S RTD S AS AP AARAMPCE Y T Y 
AKLTSDCSRPSLQWYTRAQSKMRRPRLLLKD 
ft t vrrnvT? n Yir KLNTYTTEECDMKNMH 
YVBPDHVKRAQKYAQQVLQKESPPKFAKTS 
MALLFEHRYSVDLLPFVQKAPTDSEA 


1322 


2672 


A 


10333 


25 


423 


EPSNGPWYSALGNEDDEILLLGKDIIGTFAAS 
ccvx/TOAwnVT TFT T T FVTTSGASENASTSRGC 
GLDLLPQNVYLCDLDAIWGTWEAVAGAGA 
LITLLLMLILLGRLPFIKEKEKKSPAVLHFLFL 

LGTLG 


1323 


2673 


A 


10334 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAH 
QVLTFLLLFVITSVASENASTSRGCGLDLLPQ 
YVSLCDLDAIWGIVVEAAAGAGALITLLLMLI 
LLVRLPFFKEKEKKSPVGLHFLFLLGTLGP 


1324 


2674 


A 


10336 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEb 
N S VTHHEVKCQGKPL AGIYRKREEKRN AGN 
audca yjnc QFFOKTKnARKGPLVPFPNOKSEA 
AEPPKTPPSSCDSTNAAIAKQALKKPrKGKQA 
PRKKAQGKTQQNRKXTDFYPVRRSSRKSKAE 
LQSEERKRIDELIESGKEEGMKJDLITXjKGRG 
VIATKQFSRGDFVVEYHGDLIEITDAKKREAL 
YAQDPSTGCYMYYFQYLSKTYCVDATRETN 
RLGRLINHSKCGNCQTKLHDEDGVPHLILLAS 
RDIAAGEELLYDYGDRSKASIEAHPWLKH 


1325 


2675 


A 


10338 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWL 

RGVTATFGRPAEWPGYLSHLCGRSAAMDLG 

PMHKSYRGDREAFEETHLTSLDPVKQFAAWF 

EEAVQCPDIGEANAMCLATCTRDGKPSARML 

LLKGFGKIXjFRFinWESRKGKELDSNPFASL 

VFYWEPLNRQVRVEGPVKKLPEEEAECYFHS 

RPKSSQ1GAVVSHQSSVTPDREYLRKKNEELE 

QLYQDQEWKPKSWGGYVLYPQVMEFWQG 

QTNRLHDRIVFRRGLPTGDSPLGPNfrilRGEE 

DWLYERLAP 


1326 


2676 


A 


10344 


2 


984 


" "ARAAAHCGICRLVRWWRKRRSVMG1QTSPV 
LLASLGVGLVTLLGLAVGSYLVRRSRRPQVT 
LLDPNEKYLLRLLDKTTVSHKTKRFRFALPTA 

HHTLGLPVGKHIYLSTRIDGSL VIRPYTFV 1 bU 

EDOGYVDLVIKVYLKGVHPKFPEGGKMSQY 

LDSLKVGDWEFRGPSGIXTYTGKGHFNIQP 

NKKSPPEPRVAKKLGMIAGGTGITPMLQLIRA 

1LKVPEDPTQCFLLFANQTEKD1ILREDLEELQ 

ARYPNRFKLWFTLDHPPKDWAYSKGFVTAD 

MIREHLPAPGDDVLVLLCGPPPMVQLACHPN 

LDKLGYSQKMRFTY 


1327 


2677 


A 


10345 


1 


968 


LOS AGEG VTHVLELLESPARP V AAV 1 Q VgRR 

RYHRLSDMSMLAERRRKQKWAVDPQNTAW 

SNDD S KFGQRMLEKMG WSKGKGLG AQEQG 

ATDHIKVQVKNNHLGLGATINNEDNWIAHQ 

DDFNQLLAELNTCHGQETTDSSDKKEKKSFS 

LEEKSKISKNRVHYMKFTKGKDLSSRSKTDL 

DCIFGKRQSKKTPEGDASPSTPEENETTTTSAF 

TI OE YF AKRMAALKNKPQ VP VPG SDI SETQ VE 

RKRGlCKJmKEATGKDVESYLQPKAKRHTEG 

KPERAEAQERVAKKKSAPAEEQLRGPCWDQ 

SSKASAODAGDHVQPA 


1328 


2678 


A 


10346 


173 


439 


-GSSU4KVKKCWNGYATWL^ 
CRMAFNGCCPDCKVPGDDCPLVWGQCSHCF 
HMHCILKWLHAQQVQQHCPMCRQEWKFKE 


1329 


2679 


A 


10351 


3 


964 


OMEPGNDTQISEFLLLUFSQEPGLQPb'LhWi. 

SMYL VTVLGNLLI1L ATISDSHLHTPMYFFL SN 

LSFADICVTSTTIPKMLMNIQTQNKVTFYIACL 

MQMYFF1LFAGFENFLLSVMAYDRFVAICHP 

LHYMV1MNPHLCGLLVLASWTMSALYSLLQI 

LMWRLSFCTALEIPHFFCELNQVIQLACSDSF 

LNHMVIYFTVALLGGGPLTGILYSYSKIISSIH 

AISSAQGKYKAFSTCASHLSWSLFYGAILGV 

YLSSAATRNSHSSATASVMYTWTPMLNPF1 

Y sr FFvnr^RAT.fiTHLLWGTMKGQFFKKCP 


1330 


2680 


A 


10352 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEbCEVERV 

TEHGTPKPFRKFDSVAFGESQSEDEQFENDLE 

TDPPNWQQLVSREVLLGLKPCEIKRQEVINEL 

FYTCRAHVRTLKVLDQVFYQRVSREG1LSPSE 

LRKIFSNLEDILQLHIGLNEQMKAVRKRNETS 

VIDQIGEDLLTWFSGPGEEKLKHAAATFCSNQ 

PFALEMIKSRQKKDSRFQTFVQDAESNPLCRR 

LQLKDnPTQMQRLTKYPLLLDNIATYTEWPT 

EREKVKKAADHCRQILNYVNQAVKEAENKQ 

RLEDYQRKLDTSSLKLSEYPNVEELRNLDLTK 

RKMIHEGPLVWKVNRDKTIDLYTLLLEDILV 

LLQKQDDRLVLRCHSKILASTADSKHTFSPV1 

KLSTVLVRQVATDNKALFVISMSDNGAQIYE 

LVAOTVSEKTVWQDLICRMAASVKEQSTKPI 

PLPQSTPGEGDNDEEDPSKLKEEQHG1SVTGL 

OSPDRDLGLESTLISSKPQ SHSLSTSGKSEVRD 

LFVAERQFAKEQHTDGTLKEVGEDYQIAIPDS 

HLPVSEERWALDALRNLGLLKQLLVQQLGLT 

EICSVQEDWQHFPRYRTASQGPQTDSVIQNSE 

NIKAYHSGEGHMPFRTGTGDIATCYSPRTSTE 

Q F A PR D S VGL APOD SO ASNIL VMDHMIMTPE 

MPTMEPEGGLDDSGEHFFDAREAHSDENPSE 

GDGAVNKEEKDVNLRISGNYLILDGYDPVQE 

SSTDEEVASSLTLQPMTGIPAVESTHQQQHSP 

ONTHSDGAISPFTPEFLVQQRWGAMEYSCFEI 

QSPSSCADSQSQIMEYIHKIEADLEHLKKVEE 

SYTILCQRLAGSALTDKHSDKS 


1331 


2681 


A 


10353 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEG 

AAOQQPTAPDKSKETNKTDNTE.APVTK1ELLP 

SYSTATLIDEPTEVDDPWNLPTLQDSGIKWSE 

RDTKGKILCFFQGIGRLILLLGFLYFFVCSLDIL 

SSAFQLVGGKMAGQFFSNSSrMSNPLLGLVIG 

VLVTVLVQSSSTSTSIVVSMVSSSLLTVRAAIP 

IIMGAN1GTSITNTIVALMQVGDRSEFRRAFA 

GATVHDFFNWLSVLVLLPVEVATHYLEIITQL 

IVESFHFKNGEDAPDLLKVITKPFTKI.IVQLDK 

KV1SQIAMNDEKAKNKSLVKIWCKTFTNKTQ 

INVTVPSTANCTSPSLCWTIXjIQNWTMKNVT 

YKEN1AKCQHIFVNFHLPDLAVGTILLILSLLV 

LCGCL1MIVKILGSVLKGQVATVTKKTINTDFP 

FPFAWLTGYLA1LVGAGMTFIVQSSSVFTSAL 

TP1 iniGVTTrFRAYPLTLGSNlGTTTTAILAAL 

ASPGNALRSSLQLAlLCHFFFNISGILLWYPIPFT 

RLPIRMAKGLGNISAKYRWFAVFYLI1FFFLIP 

LTVFGLSLAGWRVLVGVGVPWFIIILVLCLR 

LLQ SRCPRVLPKKJLQN WNFLPL WMRSLKP W 

DAWSKFTGCFQMRCCCCCRVCCRACCLLC 

GCPKCCRCSKCCEDLEEAQEGQDVPVKAPET 

FDNITISREAQGEVPASDSKTECTAL 


1332 


2682 


A 


10354 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPR 

GSQGKLRRVLVPMSVKPSWGPGPSEGVTAVP 

TSDLGEIHNWTELLDLFNHTLSECHVELSQST 

KRVVLFALYLAMFVVGLVENLLVICVNWRG 

SGRAGLMNLYILNMAIADLGIVLSLPVWMLE 

VTLDYTWLWGSFSCRFTHYFYFVNMYSSIFF 

LVCLSVDRYVTLTSASPSWQRYQHRVRRAM 

rArTlWVLSAIIPLPEWHIOLVEGPEPMCLFM 

APFETYSTWALAVALSTTILGFLLPFPLLTVFN 

VLTACRLRQPGQPKSRRHCLLLCAYVAVFV 

MCWLPYHVTLLLLTLHGTHISLHCHLVHLLY 

FFYDVIDCFSMLHCVINPILYNFLSPHFRGRLL 

NAVVHYLPKDQTKAGTCASSSSCSTQHSIIIT 

KGDSQPAAAAPHPEPSLSFQAHHLLPKTSPISP 

TQPLTPS 


1333 


2683 


A 


10358 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQND 

LMGTAEDFADQFLRVTKQYLPHVARLCLIST 

FLEDGIRMWFQWSEQRDYIDTTWNCGYLLA 1 

SSFVFLNLLGQLTGCVLVLSRNFVQYACFGLF 

GIIALQTIAYSILWDLKFLMRNLALGGGLLLL 

LAESRSEGKSMFAGVPTMRESSPKQYMQLGG 

RVLLVLMFMTLLHFDASFFSIVQNTVGTALMI 

LVA1GFKTKLAALTLVVWLFAINVYFNAFWT 

1PVYKPMHDFLKYDFFQTMSVIGGLLLWAL 

GPGGVSMDEKKKEW 


1334 


2684 


A 


10367 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFP 

ELPLPHVPGQESAKRRSARRFLIMSELTKELM 

ELVWGTKSSPGLSDTIFCRWTQGFVFSESEGS 

ALEQFEGGPCAV1APVQAFLLKKLLFSSEKSS 

WRDCSQEEQKELLCHTLCDILESACCDHSGS 

YCLVSWLRGK7TEETASISGSPAESSCQVEHS 

SALAVEELGFERFHALIQKRSFRSLPELKDAV 

LDQYSMWGNKFGVLLFLYSVLLTKGTENTKN 

EIEDASEPLIDPVYGHGSQSLINLLLTGHAVSN 

VWDGDRECSGMKLLGIHEQAAVGFLTLMEA 

LRYCKVGSYLKISKIPYLDCLASETHLTVTFA 

KDMAL V APE APSEQ ARR VFQT YDPEDNGFIP 

DSIXEDVMKALDLVSDPEYINl^CCNKLDPEG 

LGULLGPFLQEFFPDQGSSGPESFTVY H Y NOL 
KQSNYNEKVMYVEGTAWMGFEDPMLQTD 
DTPIKRCLOTKWPYIELLWTTDRSPSLN 


1335 


2685 


A 


10375 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPi«ML 

LGTLCEPGSGQIRYSMPEEiDKGSFVGNlAKD 

LGLEPQELAERGVRIVSRGRTQLFALNPRSGS 

LVTAGRIDREELCAQSPLCVVNFNILVENKM 

KIYGVEVEIIDINDNFPRFRDEELKVKVNENA 

AAGTRLVLPFARDADVGVNSLRSYQLSSNLH 

FSLDVVSGTDGQKYPELVLEQPLDREKETVH 

DLLLTALDGGDPVLSGTTHIRVTVLDANDNA 

PLFTPSEYSVSVPEN1PVGTRLXMLTATDPDE 

GINGKLTYSFRNEEEKISETFQLDSNLGEISTL 

QSLDYEESRFYLMEWAQDGGALVASAKVV 

VTVQDVNDNAPEVILTSLTSSISEDCLPGTVIA 

LFSVHDGDSGENGEIACSIPRNLPFKLEKSVD 

NYYHLLTTRDLDREETSDYNITLTVMDHGTP 

PLSTESHIPLK.VADVNDNPPNFPQASYSTSVT 

ENN PRG V SIFS VT AHDPD SGDN ARVTYSLAE 

DTFQGAPLSSYVSINSDTGVLYALRSFDYEQL 

RDLQLWVTASDSGNPPLSSNVSLSLFVLDQN 

DNTPEELYPALPTDGSTGVELAPRSAEPGYLV 

TKWAVDKDSGQNAWLSYRLLKASEPGLFA 

VGLHTGEVRTARALLDRDALKQSL WAVED 

HGQPPLSATFTVTVAVADRIPDILADLGSDCTP 

1DPEDLDLTLYLVVAVAAVSCVFLAFVIVLLV 

LRLRRWHKSRLLQAEGSRLAGVPASHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNY 

ADTLLSEESCEKSEPLLMSDKVDANKEERRV 

QQAPPNTDWRFSQAQRPGTSGSQNGDDTGT 

WPNNQFDTEMLQAMILASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYR 

QNVYIPGSNATLTNAAGKRDGKAPAGGNGN 

KKKSGKKEKK 


1336 


2686 


A 


10379 


1 


557 


" "RPRRRQPSFSCRVTVLEDPPCFRFTNSMN^tJ^ 
LAKLQAQVRIGGKGTARRKKKWHRTATAD 
DKKLQSSLKJCLAVNN1AGIEEVNMIKDDGTV1 
HFNNPKVQASLSANTFAITGHAEAKPITEMLP 
GILSQLGADSLTSLRKLAEQFPRQVLDSKAPK 
PEDIDEEDDDVPDLVENFDEASKNEAN 


1337 


2687 


A 


10380 




1263 


" "IPGSTISWSPAAARGLSVCRCCRLHPASAMDL 
FGDLPEPERSPRPAAGKEAQK GPLLFDDLPPA 
S STDSGSGGPLLFDDLPP ASSGDSGSLATSISQ 
MVKTEGKGAKRKTSEEEKNGSEELVEKKVC 
KASSVIFGLKGYVAERKGEREEMQDAHVILN 
DITEECRPPSSLITRVSYFAVFDGHGGIRASKF 
AAQNLHQNLIRKFPKGDVISVEKTVKRCLLD 
TFKHTDEEFLKQ AS SQKP A WKDGSTATC VLA 
VDN1LY1ANLGDSRA1LCRYNEESQKHAALSL 
SKEHNPTQYEERMR1QKAGGNVRDGRVLGV 
LEV SRS 1 GDGQ YKRCG VTS VPDIRRCQLTPND 
RFILLACDGLFKVFTPEEAVNFILSCLEDEKJQ 
TREGKSAADARYEAACNRLANKAVQRGSAD 

NVTVMWRJGH 


1338 


2688 


A 


10385 


3 


+589~ 


1 (3PSQSMAAGELEGGKPLSGLLNALAQDTFHG 
YPGITEELLRSQLYPEVPPEEFRPFLAKMRGIL 
KS1ASADMDFNQLEAFLTAQTKKQGGITSDQ 
AAV1SKFWKSHKTKIRESLMNQSRWNSGLRG 
LSWRVDGKSQSRHSAQIHTPVAIIELELGKYG 
QESEFLCLEFDEVKVNQILKTLSEVEESISTLIS 
QPN 


1339 


2689 


A 


10386 


50 


390 


LGAMA3CHHPDLIFCRKQAGVA1GRLCEKCDG 
KCVICDSYVRPCTLVRJCDECNYGSYQGRCV1 
CGGPGVSDAYYCKECTIQEKDRDGCPKJVNL 
GSSKTDLFYERKKYGFKKR 


1340 


2690 


A 


10388 


113 


3472 


S QLRKGASATH S SPSRTDCIAQMMDI YVCLK 

RPSWMVDNKRMRTASNFQWLLSTFILLYLM 

NQVNSQKKGAPHDLKCVTNNLQVWNCSWK 

APSGTGRGTDYEVCIENRSRSCYQLEKTSIKIP 

ALSHGDYEITINSLHDFGSSTSKFTLNEQNVSL 

IPDTPEILNLSADFSTSTLYLKWNDRGSVFPHR 

SNVIWEIKVLRKESMELVKLVTHNTTLNGKD 

TLHHWSWASDMPLECA1HFVEIRCYIDNLHFS 

GLEEWSDWSPVKNISWDSQTKVFPQDKVIL 

VGSDITFCCVSQEKVLSAL1GHTNCPLIHLDGE 

NVAIKJRNISVSASSGTNVVFTTEDKIFGTVIF 

AGYPPDTPQQLNCETHDLKEnCSWNPGRVTA 

LVGPRATSYTLVESFSGKYVRLKRAEAPTNES 

YQLLFQMLPNQEIYNFTLNAHNPLGRSQSTIL 

VNITEKVYPHTPTSFKVKDINSTAVKLSWHLP 

GNFAKTNFLCEIEIKKSNSVQEQRNVTIKGVE 

NSSYLVALDKLNPYTLYTFRIRCSTETFWKW 

SKWSNKKQHLTTEASPSKGPDTWREWSSDG 

KNLIIYWKPLPINEANGKILSYNVSCSSDEETQ 

SLSEIPDPQHKAEIRLDKNDY1ISWAKNSVGS 

SPPSKIASMEIPNDDLKJEQWGMGKGILLTW 

HYDPNMTCDYVIKWCNSSRSEPCLMDWRKV 

PSNSTETVIESDEFRPGIRYNFFLYGCRNQGY 

QLLRSM3GYIEELAPIVAPNFTVEDTSADSILV 

KWEDIPVEELRGFLRGYLFYFGKGERDTSKM 

RVLESGRSDIKVKNITD1SQKTLRIADLQGKTS 

YHLVLRAYTDGGVGPEKSMYWTKENSVGL 

ILAILIPVAVAVIVGVVTSILCYRKREWIKETFY 

PDEPNPENCKALQFQKSVCEGSSALKTLEMNP 

CTPNNVEVLETRSAFPKIEDTEIVSPVAERPEN 

RSD AKPENHVVES YCPPIIEEEIPNP AADETGG 

TAQVIYIDVQSMYQPQAKPEEEQENDPVGGA 

GYKPQMHLPINSTVEDIAAEEDLDKTAGYRP 

QANVNTWNLVSPDSPRSIDSNSEIVSFGSPCSI 

NSRQFLIPPKDEDSPKSNGOGWSFTNFFQNKP 

NT) 


1341 


2691 


A 


10392 


1 


5057 


' MLPPKHLSATKPKKSWAPNLYELDSUL lKi^ 
DVnGEGPTDSEFFHQRFRNLIYVEFVGPRKTL 
HCLRNLCLDWLQPETRTKEEIIELLVLEQYLT1I 
PEKLKPWVRAKKPENCEKLVTLLENYKEMY 
QPEGESLHGVLWSAGLRCPLGLSASTLLTW 
SGLDNSLSWAAVGMSCVLWDIELHHDFLGV 
ATKSVSTHAQGDAAQGLGGTIVRMWARDSN 
LATGVLLDDNNSDVTSDDDMTRNRRESSPPH 
SVHSFSGDRDWDRRGRSRDTEPRDRWSHTR 
NPRSRMPPRDLSLPVVAKTSFEMDREDDRDS 
RAYESRSQDAESYQNWDLAEDRKPHNTIQD 
KMEhmiKLLSLGVQLAEDDGHSHMTQGHSS 
RSKRSAYPSTSRGLKTMPEAKKSTHRRGICED 
ESSHGVIMEKFIKDVSRSSKSGRARESSDRSQ 
RFPRMSDDNWKDISLNKRESVIQQRVYEGNA 
FRGGFRFNSTLVSRKRVLERKRRYHFDTDGK 
GSIHDQKGCPRKKPFECGSEMRKAMSVSSLS 
SLSSPSFTESQPrDFGAMPYVCXJECGRSFSVIS 
EFVEHQIMHTRENLYEYGESFIHSVAVSEVQK 

S Q V GOKRFECKDCGbTF N KS AAL AEHKKJHA 

RGYLVECKNQECEEAFMPSPTFSELQKIYGK 

DKfYECRVCKETFLHSSALIEHQKIHFGDDKD 

NEREHERERERERGETFRPSPALNEFQKMYG 

KEKJVIYECKVCGETFLHSSSLK£HQKIHTRGN 

PFENKGKVCEETFIPGQSLKRRQKTYNKEKLC 

DFTDGRDAFMQSSELSEHQKIHSRKNLFEGR 

GYEKSV1HSGPFTESQKSHT1TRPLESDEDEKA 

FnSSNPYENQKlPTKENVYEAKSYERSVIHSL 

ASVEAQKSHSVAGPSKPKVMAESTIQSFDAIN 

HQRVRAGGNTSEGREYSRSVIHSLVASKPPRS 

HNGNELVESNEKGESSIYISDLNDKRQKIPAR 

ENPCEGGSKNRNYEDSVIQSVFRAKPQKSVP 

GEG SGEFKKDGEFS VPS SNVRE YQKARAKKK 

YIEHRSNETSVIHSLPFGEQTFRPRGMLYECQ 

ECGECFAHSSDLTEHQKJHDREKPSGSRNYE 

W S VTRSL APTDPQTS Y AQEQ Y AKEQ ARNKCK 

DFRQFFATSEDLNTNQKJYDQEKSHGEESQGE 

NTDGEETHSEETHGQETIEDPVIQGSDMEDPQ 

KDDPDDKIYECEDCGLGFVDLTDLTDHQKVH 

SRKCLVDSREYTHSVIHTHSISEYQRDYTGEQ 

LYECPKCGESFIHSSFLFEHQRIHEQDQLYSM 

KGCDDGFIALLPMKPRRNRAAERNPALAGSA 

IRCLLCGQGFMSSALNEHMRLHREDDLLEQS 

QMAEEAIIPGLALTEFQRSQTEERLFECAVCG 

ESFVNPAELADHVTVHKNEPYEYGSSYTHTS 

FLTEPLKGAIPFYECKDCGKSFIHSTVLTKHKE 

i iTT crcrnnPAA AAA A A A AOEVEANVHVPO 

WLRIQGLNVEAAEPEVEAAEPEVEAAEPEV 

EAAEPNGEAEGPDGEAAEPIGEAGQPNGEAE 

QPNGDADEPDGAGIEDPEERAEEPEGKAEEPE 

GDADEPDGVGIEDPEEGEDQEIQVEEPYYDC 

HECTETFTSSTAFSEHLKTHASMIIFEPANAFG 

ECSGYIERASTSTGGANQADEKYFKCDVCGQ 

LFNDHLSLARHQNTHTG 


1342 


2692 


A 


10393 


2 


1350 


GRPRSSSDNRNFLRERAGLSSAAVQTR1UNSA 

ASRRSPAARPPVPAPPALPRGRPGTEGSTSLS 

APAVLWAVAWWWSAVAWAMANYIHV 

PPGSPEVPKLNVTVQDQEEHRCREGALSLLQ 

HLRPHWDPQEVTLQLFTDGITNKLIGCYVGN 

TMEDWLVRIYGNKTELLVDRDEEVKSFRVL 

OAHGCAPQLYCTFNNGLCYEFIQGEALDPKH 

VCNP AIFRLIARQLAKJHAIHAHNGWIPKSNL 

WLKMGKYFSUPTGFADEDINKRFLSDIPSSQI 

LQEEMTWMKEILSNLGSPVYLCHNDLLCKNII 

YNEKQGDVQFIDYEYSGYNYLAYDIGNHFNE 

FAGVSDVDYSLYPDRELQSQWLRAYLEAYK 

EFKGFGTEVTEKEVEILFIQVNQFALASHFFW 

GLV/ALIQAKYSTIEFDFLGYAIVRFNQYFKM 

KPEVTALKVPE 


1343 


2693 


A 


10394 


102 


839 


PEAQTSAVLAR£KGHLPTMRHEAPMQMAbA 

QDARYGQKDSSDQNFDYMFKLLUGNSSVGK 

TSFLFRYADDSFTSAFVSTVGIDFKVKTVFKN 

EKRIKLQfWDTAGQERYR 1 ITTAYYRGAMGFI 

LMYDITNEESFNAVQDWSTQDCTYSWDNAQ 

VILVGNKCDMEDERVISTERGQHLGEQLGFE 

FFETSAKDN1NVKQTFERLVDDCDKMSESLET 

DPAITAAKQNTRLKETPPPPQPNCAC 


1344 


2694 


A 


10395 


2 


4136 


" DRPPWNSRVDDFVTNLLHLSSKGHISPAKDTS 
LQQRTPAEMSPVLHFYVRPSGHEGAASGHTR 

RXLQGKLPELQG VETELC Y N VN WTAEALPS A 

EETKJO.MWLFGCPLLLDDVARESWLLPGSN 

DLLLEVGPRLNFSTPTSTN1VSVCRATGLGPV 

DRVETTRRYRLSFAHPPSAEVEAIALATLHDR 

MTEQHFPHPIQSFSPESMPEPLNGPINDLGEGR 

LALEKANQELGLALDSWDLDFYTKRFQELQR 

NPSTVEAFDLAQSNSEHSRHWFFKGQLHVDG 

QKLVHSLFESIMSTQESSNPNNVLKFCDNSSA 

IQGKEVRFLRPEDPTRPSRFQQQQGLRHWFT 

AETHNFPTGVCPFSGATTGTGGRIRDVQCTG 

RG AHWAGTAG YCFGNLHIPG YNLP WEDLSF 

QYPGNFARPLEVAIEASNGASDYGNKFGEPV 

LAGFARSLGLQLPDGQRREWIKPIMFSGG1GS 

MEADHISKEAPEPGMEVVKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQ 

KMNRVIRACVEAPKGNPICSLHDQGAGGNG 

NVLKELSDPAGAIIYTSRFQLGDPTLNALEIW 

GAEYQESNALLLRSPNRDFLTHVSARERCPA 

CFVGTITGDRRIVLVDDRECPVRRNGQGDAP 1 

PTPPPTPVDLELEWVLGKMPRKEFFLQRKPP 

MLQPLALPPGLSVHQALERVLRLPAVASKRY 

LTNKVDRSVGGLVAQQQCVGPLQTPLADVA 

WALSHEELIGAATALGEQPVKSLLDPKVAA 

RLAVAEALTNLVFALVTDLRDVKCSGNWM 

W AAKLPGEGAALADACEAMVAVMAALG VA 

VDGGKDSLSMAARVGTETVRAPGSLVISAYA 

VCPDITATVTPDLKHPEGRGHLLYVALSPGQ 

HRLGGTALAQCFSQLGEHPPDLDLPENLVRA 

FSITQGLLKDRLLCSGHDVSDGGLVTCLLEM 

AFAGNCGLQVDVPVPRVDVLSVLFAEEPGLV 

LEVQEPDLAQVLKRYRDAGLHCLELGHTGE 

AGPHAMVRVSVNGAWLEEPVGELRALWEE 

TSFQLDRLQAEPRCVAEEERGLRERMGPSYC 

LPPTFPKASVPREPGGPSPRVAILREEGSNGDR 

EMAD AFHL AGFE VWD VTMQDLCSG AIGLDT 

ttd n V A F V GGFS Y AD VLGS AKG W AAA VTFHP 

RAGAELRRFRKRPDTFSLGVCNGCQLLALLG 

WVGGDFNEDAAEMGPDSQPARPGLLLRHNL 

SGRYESRWASVRVGPGPALMLRGMEGAVLP 

VWSAHGEGYVAFSSPELQAQIEARGLAPLHW 

ADDDGNPTEQYPLNPNGSPGGVAGICSCDGR 

HLAVMPHPERAVRPWQWAWRPPPFDTLTTS 

PWLOLFINARNWTLEGSC 


1345 


2695 


A 


10396 


65 


642 


" GVRGFW AGTMASRAGPRAAGTDGSDh QHKb 
RVAMHYQMSVTLKYEIKiaiYVHLVIWLLLV 
AKMSVGHLRLLSHDQVAMPYQWEYPYLLSI 
LPSLLGLLSFPRNNISYLVLSMISMGLFSIAPLI 
YGSMEMFPAAQQLYRHGKAYRFLFGFSAVSl 
MYLVLVLAVQVHAWQLYYSKKLLDSWFTST 


1346 


2696 


A 


10398 


1 


/ 1 0 


DDFVRCGPQSAAMGASARLLRAVlMtjArui. 

GKGTVSSRJTTHFELKHLSSGDLLRDNMLRGT 

EIGVLAKAFIDQGKLIPDDVMTRLALHELKNL 

TQYSWLLDGrPRI UVAtALUKAi^i Tin*- 
NVPFEVIKQRLTARWIHPASGRVYNIEFNPPK 
TVGIDDLTGEPLIQREDDKJ^ETVIKRLKAYED 
QTKPVLEYYQKKGVLETFSGTETNKIWPYVY 

AFLOTKVPQRSQKASVTP 


1347 


2697 


A 


10402 


153 


3 969 


KHRQENNALDMAPELKNrTGPMCLlhJN l NU^L 
VANPEALK1LSAITQPVVWAIVGLYRTGKSY

LMNKUVGKNKGFSLGSl V JU5H 1 KX,1 WM WCV 

PHPKKPEHTLVLLDTEGLGDVKKGDNQNDS 

WUTLAVLLSSTLVYNSMGTINQQAMDQLYY 

VTELTHRIRSKSSPDENENEDSADFVSFFPDFV 

WTLRDFSLDLEADGQPLTPDEYLEYSLKLTQ 

GTSQKDKNFNLPRLCIRKFFPKKKCFVFDLPI 

HRPXLAQLEKLQDEELDPEFVQQVADFCSY1 

FSNSKTKTLSGGIKVNGPRLESLVLTYINAISR 

GDLPCMENAVLALAQIENSAAVQKAIAHYD 

OOMGQKVQLPAETLQELLDLHRVSEREATEV 

YMKNSFKJ5VDHLFQKKLAAQLDKKRDDFCK 

QNQEASSDRCSALLQVIFSPLEEEVKAGIYSK 

PGGYCLFIQKLQDLEKKYYEEPRKGIQAEEIL 

OTYLKSKESVTDAILQTDQILTEKEKEIEVEC 

VKAESAQASAKMVEEMQIKYQQMMEEKEKS 

YOEHVKQLTEKMERERAQLLEEQEKTLTSKL 

QEQARVLKERCQGESTQLQNEIQKLQKTLKK 

KTKRYMSHKLK1 


1348 


2698 


A 


10404 


5 


892 


TOLP APLSG VLSRLQLGSG APLLTWVQb 1 AU 

V AGG APRRRTP VTM WRLL ARAS APLLRVPLS 

DSWALLPASAGVKTLLPVPSFEDVSIPEKPKL 

RFIERAPLVPKVRREPKNLSDIRGPSTEATEFT 

EGNFAILALGGGYLHWGHFEMMRLTINRSM 

DPKNMFAIWRVPAPFKPITRKSVGHRMGGGK 

GAIDHYVTPVKAGRLWEMGGRCEFEEVQG 

FLDQV AHKLPFAAKAVSRGTLEKMRKDQEE 

RERNNQKPWTFERIATANMLG1RKVLSPYDL 

THKGKYWGKFYMPKRV 


1349 


2699 


A 


10409 


59 


1184 


' LRRNCS ALGGLFQT1ISDMKG S YP VW HVF ink 
AGKLQSQLRTTVVAAAAFLDAFQKVADMAT 

NTRGGTREIG S ALTRMCMRHRS IE AKJLRQF S S 

AL1DCLINPLQEQMEEWKKVANQLDKDHAK 

EYKKARQEIKKKSSDTLKLQKXAKKGRGDIQ 

PQLDSALQDVNDK.YLLLEETEKQAVRKALIE 

ERGRFCTFISMLRPVIEEEISMLGE1THLQTISE 

DLKSLTMDPHKLPSSSEQVILDLKGSDYSWS 

YOTPPSSPSTTMSRKSSVCSSLNSVNSSDSRSS 

GSHSHSPSSHYRYRSSNLAQQAPVRLSSVSSH 

riQnPT^nnAFOSKSPSPMPPEAPNQRRKEKRE 

pnPNGGGPTTASGPPAAAEEAQRPRSM 


1350 


2700 


A 


10410 


511 


958 


AGRGGPGKPVSWSSGPGSKiQTQRRSW VRb i 

RGHSSLLPPSQDFVAGLSVILRGTVDDRLNW 

AFr^YDLNKDGClTKEEMU)IMKSIYDMMG 

KYTYPALREEAPREHVESFFQKMDRNKDGV 

VTIEEFIESCQKDEN1MRSMQLFDNV1

WHAT IS CLAIMED IS:
1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1350, a mature protein coding portion of SEQ ID NO: 1-1350, an
active domain of SEQ ID NO: 1-1350, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization
conditions.

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively
associated with a regulatory sequence that modulates expression of the polynucleotide in the host
cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-1350.

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide
of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions;

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex is
detected, a compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 



a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1350, a mature protein
coding portion of SEQ ID NO: 1-1350, an active domain of SEQ ID NO: 1-1350,
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent
conditions to SEQ ID NO: 1-1350, under conditions sufficient to express the polypeptide in said
cell; and

b) isolating the polypeptide from the cell culture or cells of step (a).

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of SEQ ID NO: 1351-2700, the mature protein portion thereof, or the active domain 
thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1350. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 



26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.

27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.